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Abstract 

Despite the broad appeal of missing data handling approaches that assume a missing at 
random (MAR) mechanism (e.g., multiple imputation and maximum likelihood estimation), 
some very common analysis models in the behavioral science literature are known to cause bias- 
inducing problems for these approaches. Regression models with incomplete interactive or 
polynomial effects are a particularly important example because they are among the most 
common analyses in behavioral science research applications. In the context of single-level 
regression, fully Bayesian (model-based) imputation approaches have shown great promise with 
these popular analysis models. The purpose of this paper is to extend model-based imputation to 
multilevel models with up to three levels, including functionality for mixtures of categorical and 
continuous variables. Computer simulation results suggest that this new approach can be quite 
effective when applied to multilevel models with random coefficients and interaction effects. In 
most scenarios that we examined, imputation-based parameter estimates were quite accurate and 
tracked closely with those of the complete data. The new procedure is available in the Blimp 
software application for macOS, Windows, and Linux, and the paper includes a data analysis 


example illustrating its use. 
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A good deal of methodological literature supports missing data handling methods that 
assume a missing at random (MAR) mechanism whereby the probability of missingness is 
unrelated to an incomplete variable’s scores after conditioning on the observed data (Little & 
Rubin, 2002; Rubin, 1976). Full information maximum likelihood estimation and multiple 
imputation are MAR-based methods that enjoy widespread use in behavioral science 
applications. When missing values are restricted to the outcome variable, maximum likelihood 
solutions abound in popular software packages (e.g., mixed modeling packages in SPSS, Stata, 
R, etc.) and are probably preferable because valid estimates are obtained by simply fitting the 
analysis model to the observed data (Little, 1992; von Hippel, 2007). For additive analysis 
models with incomplete explanatory variables (e.g., multiple regression, multilevel models with 
random intercepts), classic multiple imputation routines are similarly plentiful and effective 
(Schafer, 1997; Schafer & Yucel, 2002; van Buuren, 2012; Van Buuren, Brand, Groothuis- 
Oudshoorn, & Rubin, 2006), and maximum likelihood estimation can be implemented by 
specifying the predictors as random normal variables in structural equation modeling software 
such as Mplus (Muthén & Muthén, 1998-2017). 

Despite their widespread appeal and availability, classic missing data handling 
procedures are known to induce bias when applied to a broad class of single-level and multilevel 
regression models featuring interactive effects, polynomial terms, or random coefficients. For 
example, conventional imputation approaches typically invoke reverse regressions where the 
outcome variable predicts incomplete covariates. This specification is appropriate for additive 
models with normally distributed variables, but reverse regressions are often statistically 
incompatible with analyses that include interactive or non-linear terms because they may define 
an implausible distribution of missing values (Bartlett, Seaman, White, & Carpenter, 2015; Kim, 
Sugar, & Belin, 2015; Liu, Gelman, Hill, Su, & Kropko, 2014). This incompatibility is the 
source of biases reported in the literature (Bartlett et al., 2015; Enders, Baraldi, & Cham, 2014; 
Enders, Hayes, & Du, 2019; Grund, Ludtke, & Robitzsch, 2018; Seaman, Bartlett, & White, 
2012; Zhang & Wang, 2017). Although our focus is multiple imputation, it is important to note 
that problems associated with non-linearities are also germane to maximum likelihood estimation 
(Enders et al., 2014; Yuan & Savalei, 2014). 

A growing body of recent missing data research has on Bayesian approaches that tailor 


multiple imputation to a particular analysis model (Bartlett et al., 2015; Erler, Rizopoulos, 
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Jaddoe, Franco, & Lesaffre, 2017; Erler et al., 2016; Goldstein, Carpenter, & Browne, 2014; 
Kim, Belin, & Sugar, 2018; Kim et al., 2015; Zhang & Wang, 2017). Roughly speaking, these 
methods specify a distribution for the explanatory variables (e.g., a normal distribution), then 
selectively choose imputations from this distribution that fit well (1.e., produce a high likelihood) 
when evaluated in an analysis model with interactive or non-linear terms. Simulation results 
from the aforementioned studies suggest that Bayesian approaches to imputation offer substantial 
improvement over older missing data handling methods for interactive and non-linear effects 
(e.g., just-another-variable imputation, passive imputation). Because the Bayesian estimation 
explicitly incorporates the substantive analysis model, we refer to this approach simply as model- 
based imputation. 

Building on recent developments, the purpose of our paper is to outline and evaluate a 
model-based imputation procedure that correctly handles incomplete predictor variables in a 
wide range of single-level and multilevel regression models with non-linear effects (e.g., 
interactions, polynomial terms, random coefficients). Our approach is related to other Bayesian 
imputation methods (e.g., substantive model-compatible imputation the sequential Bayesian 
approach) described in the literature (Bartlett et al., 2015; Grund et al., 2018; Ibrahim, Chen, & 
Lipsitz, 2002; Kim et al., 2018; Kim et al., 2015; Zhang & Wang, 2017)!. In particular, we 
extend recent work by Erler and colleagues (Erler et al., 2017; Erler et al., 2016) and Goldstein et 
al. (2014)? to accommodate general missing data patterns in data sets with up to three levels, 
including functionality for incomplete categorical variables. 

The structure of the paper is as follows. First, we discuss the issue of compatibility, as 
this helps clarify why conventional imputation approaches fail when applied to analysis models 
with interactive or non-linear effects. Second, we provide an overview of model-based 
imputation in the context of a single-level moderated regression analysis. Third, we outline an 
extension of model-based imputation that accommodates data structures with up to three levels, 


and we then show how to accommodate categorical variables in the procedure. Fourth, we report 


' Model-based imputation for single-level regression models is available in the R packages ‘smcfcs’ (Bartlett & 
Keogh, 2018) and ‘mdmb’ (Robitzsch & Ltidke, 2018) and in dedicated Bayesian analysis packages such as 
OpenBUGS (Zhang & Wang, 2017) and JAGS , among others. 

> The two-level imputation procedure from Goldstein et al. (2014) is available in the REALCOM (Carpenter, 
Goldstein, & Kenward, 2011) software and the R packages ‘jomo’ (Quartagno & Carpenter, 2018) and ‘mdmb’ 
(Robitzsch & Liidke, 2018). 
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the results from three simulation studies that evaluate the proposed procedure. Finally, we use 
the Blimp application to apply model-based imputation to a real data analysis. 
Compatibility of the Analysis and Imputation Models 

An important concern with multiple imputation is whether the distributions implied by an 
imputation procedure match those induced by the analysis model. This issue, known as 
compatibility, has roots in mathematical statistics (Arnold, Castillo, & Sarabia, 1999, 2001; 
Arnold & Press, 1989) and is a topic of recent interest in the missing data literature (Bartlett et 
al., 2015; Carpenter & Kenward, 2013; Hughes et al., 2014; Liu et al., 2014). 

To illustrate the concept of compatibility, consider a linear regression analysis model 


with an incomplete predictor X. 


Vi = Bo + Bi i) + & 


(1) 
E, ~ N(0,02) 
A typical application of multiple imputation uses a reverse linear regression that to define a 
distribution of missing values, given the outcome variable. 
Xi =VYot 10%) + ej 
(2) 


Xigmisy ~ N(Yo + 1.0%), 82) 


Together, analysis and imputation form a set of conditional models, and these conditional models 
imply a set of conditional distributions — the analysis model assumes that Y is normal given_X, 
and the imputation model assumes that X is normal given Y. Compatibility is concerned with 
whether a set of conditional models and their corresponding distributions relate to one another in 
a coherent way. 

The formal definition of compatibility given by Arnold and colleagues (Arnold et al., 
1999, 2001; Arnold & Press, 1989) and more recently by Liu et al. (2014)? and Bartlett et al. 


_;, 9;): 8; € 
pipe ij 

O,j=1.., p} is said to be compatible if there exists a joint model f{(x|@): 0 € ©} and a collection of surjective 

maps {t;: 60> 0;:j7=1, oP} such that for each j, 0; € ©; and @ € t; 1(6;) = {0: t;(@) = 6;}, we have 

gj (%;|x_;,8;) = fi (xjlx_;, 6). Otherwise, {gyi = 1,... pt is said to be incompatible.” 


3 Definition 1 of Liu et al., 2014, p. 160) states the following: “A set of conditional models {g j (x; |x. 
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(2015) is complex, so we sketch the basic ideas here and refer interested readers to these sources 
for additional details. Distilled at a basic level, the definition of compatibility says that a joint 
distribution exists, and the conditional distributions induced by this joint distribution are exactly 
the same as those we specify in our analyses. Returning to the previous example, the bivariate 
normal distribution has the property that its corresponding conditional distributions are also 
normal with constant variance. Thus, analysis and imputation models from Equations 1 and 2 are 
compatible because they are identical to those induced by a bivariate normal joint distribution. 
The practical implication of compatibility is that the imputation model should generate 
appropriate imputations for the analysis because the two models are functionally linked to a 
common joint distribution. 

Next, consider a moderated regression model (Aiken & West, 1991; Cohen, Cohen, 


West, & Aiken, 2002), examples of which abound in the applied literature. 


Vi = Bo + Bia) + Boi) + B31) ai) + €: G) 
Ee, ~ N(0, 07) 
Further, assume that X, and the product are incomplete. The moderated regression analysis 
assumes the outcome variable is conditionally normal, given the covariates and their interaction. 
However, the interactive effect in the analysis model precludes the possibility that X, is normal 
when conditioning on Y (Arnold et al., 1999, 2001; Arnold & Press, 1989; Bartlett et al., 2015; 
Liu et al., 2014; Sarabia, Castillo, & Arnold, 2001; Seaman et al., 2012). As such, an imputation 
model based on reverse linear regression is incompatible with the moderated regression because 
it specifies a distribution of missing values that is implausible given the interaction term in the 
analysis model. This incompatibility is the source of the biases noted in the literature (Bartlett et 
al., 2015; Enders et al., 2014; Kim et al., 2015; Seaman et al., 2012; Zhang & Wang, 2017). 
Model-based imputation attempts to remedy this problem by sampling imputations from a set of 
compatible models. 
Model-Based Imputation for Single-Level Regression 

So that readers can better understand our multilevel imputation scheme, this section 

summarizes model-based imputation for a single-level moderated regression analysis such as that 


in Equation 3. The methodology we describe here is closely related to substantive model- 
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compatible imputation and the sequential Bayesian approach from the literature (Bartlett et al., 
2015; Goldstein et al., 2014; Grund et al., 2018; Ibrahim et al., 2002; Kim et al., 2018; Kim et 
al., 2015; Liidke, Robitzsch, & West, 2019; Zhang & Wang, 2017)‘. Note that imputation relies 
on a set of regression models, the parameters of which are obtained via an iterative Bayesian 
estimation algorithm (the Gibbs sampler). We discuss the estimation steps later in the 
manuscript, but for now assume that the necessary quantities have been estimated. 

A variety of older imputation methods can be applied to the moderated regression 
analysis (Grund et al., 2018; Kim et al., 2018; Kim et al., 2015; Seaman et al., 2012; van Buuren, 
2012; van Buuren et al., 2018; Vink & van Buuren, 2013; von Hippel, 2009), including passive 
imputation (e.g., impute X, conditional on Y and Xz, then compute the product term 
deterministically) and just-another-variable imputation (e.g., treat the product of X; and X2 as 
variable to be imputed). The limitations of these approaches are well documented, so we refer 
interested readers to the literature for additional information (e.g., recent work by Kim and 
colleagues provides a comprehensive evaluation of several imputation strategies; Kim et al., 
2018; Kim et al., 2015). 

The idea behind model-based imputation is to parameterize the imputation problem as a 
set of compatible univariate distributions, one of which aligns with the analysis model. To frame 
the procedure that we adopt throughout the manuscript, consider the analysis model from 
Equation 3. We motivate our procedure by applying the conditional probability rule to factor the 


joint distribution of the analysis variables as 
D(Y, X1,X2) = p(V|X1,X2) X p(X, X2) (4) 


where p(Y, X1, Xz) is the joint distribution, p(Y|X;, Xz) is the distribution of Y induced by the 
analysis model (1.e., a normal distribution, conditional on the covariates and their interaction), 
and p(X,, X2) is the joint distribution of the covariates. The above expression readily generalizes 


to a scenario with R covariates, in which case the factorization is p(Y,Xj,...,Xp) = 


P(Y|X;,...,Xp) X p(Xj, «.., Xp). 


4 The key differences among these procedures are how they partition the covariate distribution and whether they 
treat complete covariates as random variables. Our approach specifies all predictor variables as normally distributed 
variables, regardless of whether they are incomplete. 
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Although it is not the only way to do so, we assure that imputation models arising from 
Equation 4 are mutually compatible by specifying a multivariate normal distribution for the 
explanatory variables. That is, we do not allow non-linear relations among covariates. Although 
conceptually similar to what we are doing, the so-called “sequential” parameterization of the 
joint distribution (Erler et al., 2017; Erler et al., 2016; Ibrahim et al., 2002; Liidke et al., 2019) is 
somewhat more flexible in that it can accommodate non-linear relations among covariates. In our 
view, specifying linear relations among the covariates is not a substantial practical limitation 
because most substantive researchers would not have a theoretical basis for specifying non-linear 
relations among variables that would otherwise have been treated as fixed in a complete-data 
analysis. 

To impute X,, we must derive its conditional distribution given the other analysis 
variables. Applying Bayes theorem gives the expression from Bartlett et al. (2015) and Kim et al. 
(2015), among others. 


D(XLIY, X2) « pV |Xy, X2) X p(X1|X2) (5) 


A benefit of assuming multivariate normality for the covariates is that the joint distribution of 
dependent variable and covariates must exist (a critical component of compatibility) when the 
analysis model is specified as a linear regression with normal errors and constant variance, as 


follows (Armold et al., 1999, 2001; Arnold & Press, 1989; Liu et al., 2014). 


X15 = Yo + ¥1 (X25) + 6; 
; (6) 
e; ~ N(0,02) 
Because X, appears in both terms on the right side of Equation 5 — once as a covariate 
and once as an outcome — the posterior distribution of missing values has a complex form that 


depends on the product of two normal distributions. 


P(X1|Y,X2) « N(Bo + By i) + Bo 2) + Bs O11) 21), 07) X No + ¥1 ai), 02) (7) 
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We can derive the exact distribution of missing values by comparing the product of the two 
normal kernels to the form of a univariate normal distribution and matching powers of X,, which 


gives an expression algebraically equivalent to the one in Kim et al. (2015, p. 1878). 


P(X|Y, X2) = 


N 2, (Bi + B3X2i) Vi — Bo — Box2i) + OF Yo + Y1%2i) Of O¢ (8) 
o¢ (By + B3X2))? + o? 02 (By + B3X2i)? + 02) 


Equation 8 shows that the mean and variance of the incomplete variable are non-linear functions 
of the other variables and two sets of model parameters. Comparing the correct conditional 
distribution to the misspecified one implied by a linear model with constant error variance (e.g., 
one similar to Equation 2) underscores the problem with standard imputation schemes. 

To perform imputation, a Bayesian estimation sequence (discussed later) generates the 
coefficients and residual variances for the regression models (i.e., B, 02, y, and a2), after which 
the algorithm samples X, imputations from the distribution in Equation 8. The product is then 
computed by multiplying the resulting imputations by the corresponding observed values of X>. 
Kim et al. (2015) show that computing the product in this manner is equivalent to drawing X, 
and its interaction term as a pair. In the general situation with more than one missing covariate, a 
comparable distribution can be derived for p(X2|Y, X,) — all that is needed are the model 
parameters from the regression of Xz on X,. Finally, missing outcome scores do not require a 
special procedure, as an imputation model with the same form as the analysis model generates 
replacement values (i.e., replacement values are drawn from a normal distribution with mean and 


variance equal to By + B1%1; + BoX2i + B3xX1;X2; and 02, respectively). 


Model-Based Imputation for Multilevel Regression Analyses 
Having illustrated model-based imputation in the context of a familiar single-level 
regression analysis, we now extend the procedure to multilevel regression models with missing 
values at any level. To keep the ensuing discussion as simple as possible, we describe the 
procedure for a two-level analysis, but the extension to three levels is straightforward. To 
provide an analytic context, consider a two-level random coefficient analysis with a pair of level- 


1 covariates and a single level-2 explanatory variable 


MODEL-BASED IMPUTATION 


Vij = Bo + Bi (x1ij) + Ba (xis) + Bs (Xj) + Boj + bry (x14) + &j 


be 9) 
b ~ MN(0,%,) €i; ~ N(0, 02) 
1j 


where the f’s are fixed effects, bo; and b,; are random intercept and random slope residuals, 
respectively, for level-2 cluster j, and €;; is a within-cluster residual for observation 7 in cluster /. 
For simplicity, we do not center predictor variables in the analysis model, but the resulting 
imputations can be grand or group mean centered in the subsequent analysis (Enders & Tofighi, 
2007; Kreft, de Leeuw, & Aiken, 1995). 

This model, which features an interaction between a manifest variable and a random 
effect (i.e., latent slope variable), has been the focus of recent missing data research (Enders et 
al., 2019; Enders, Keller, & Levy, 2018; Grund, Ludtke, & Robitzsch, 2016; Grund et al., 2018; 
Kunkle & Kaizer, 2017; Liidke, Robitzsch, & Grund, 2017). A standard method for imputing X, 
is to specify a “reverse random coefficient model” that features Y as a random slope predictor of 


ee 


xij = Vo + V1 (Vij) + V2(Xeij) + V3 (%3j) + Voy + Urj(Viz) + i 
U . 
(u,’) ~MN(0,E,) e; ~ N(0,02) (10) 


X1ij¢misy ~ No + ni(vij) sae (x2i;) ag ¥3(%3;) + Ug; + ur; (Vij), Oe) 


Consistent with the moderated regression analysis, this reverse regression model is incompatible 
with the analysis model because it incorrectly assumes that the conditional distribution of X, 
given Y is normal. As such, it gives imputations that are implausible given the random 
coefficient in the analysis model. Later in this section we show that the correct (compatible) 
conditional distribution is a complex non-linear function similar to that in Equation 8. 

To begin, we specify a multivariate normal distribution for the explanatory variables 
because this ensures that we can derive a set of mutually compatible imputation models using 


pairs of regression models, e.g., p(Y|X1, ...,Xp) X D(X,|Xq, Xp) Xp) +» Xp). Further, we 


10 
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divide the R covariates into a set of P predictors at level-1 and a second set of QO covariates at 
level-2. Note that auxiliary variables enter both conditional models in the same way as the 
substantive covariates, so we do not differentiate substantively-motivated predictor variables 


from auxiliary variables. The joint distribution for a two-level analysis model is 
x) ~ MN(w;,%,) x? ~ MN(w,E2) (11) 


(1) 


where x; ~~ is a P-element vector of level-1 scores for observation i, 4; is the corresponding a P- 


element vector of latent cluster means (i.e., random intercepts) for group /, x) 


j is an R-element 


between-cluster score vector that includes the P latent group means in pt; and the Q level-2 
covariate scores, ft contains the R grand means, XZ, is a P by P within-cluster covariance matrix, 
and Z, is an R by R between-cluster covariance matrix. For simplicity, we focus on two-level 
analyses, but the extension to three levels is straightforward. In this case, the covariate 


distribution becomes 
x) ~ MN(Hjx,1) x; ~ MN(#y,%2) xy ~ MN(u, Es) (12) 


where k indexes the third level. The online supplemental document gives a description of the 
three-level procedure. Note that Equations 11 and 12 are slightly different from recent related 
work that treats complete covariates as fixed (Erler et al., 2017; Erler et al., 2016). 

Returning to the random coefficient analysis from Equation 9, the predictors follow a 


multivariate normal distribution 


Xai ay 
U 
()) ~ MN (Hy Es) (1) ~ MN(#,22) (13) 
ij X3; 
where is = (x1ij, X2i;), Hj — (1), Ho;)s x2) = (us), inj, %3j)s h= (4, Ha, 3), xy isa2 by 2 
J 


within-cluster covariance matrix, and Z, is a3 by 3 between-cluster covariance matrix. 


Importantly, the cluster means in ft; can be represented as arithmetic averages of the level-1 


11 


MODEL-BASED IMPUTATION 


scores, or they can be modeled as normally distributed latent variables (Ltidke, Marsh, 
Robitzsch, & Trautwein, 2011; Liidke et al., 2008; Shin & Raudenbush, 2010). In this context, 
latent group means are just random intercepts from a multilevel model and should not be 
confused with latent means from a factor analysis model. Following the recommendation of 
Grund, Liidke, and Robitzsch (2017), we use latent cluster means because these quantities 
readily accommodate unequal group sizes, whereas taking arithmetic average of level-1 scores 
assumes balanced data (Carpenter & Kenward, 2013; Grund et al., 2017; Resche-Rigon & White, 
2018)°. The online supplement gives the full conditional distribution of the latent cluster means 
for our two-level example, and we point interested readers to Keller, Du, and Enders (2019) for a 
detailed treatment of fully conditional specification imputation with latent means. 

Next, we must derive the conditional distribution of X;. given all other predictors, X_, = 
(Xy, 1) Xp) Xai) +» Xp). Applying Bayes’ theorem gives p(X,|Y,X_,) « p(Y|X;, X_,) X 
p(X,|X_,), which is a more general expression for Equation 5. The multivariate normality 
assumption for the covariates again ensures that specifying each p(X;-|X_,-) as a linear regression 
with normal errors and constant variance yields mutually compatible imputation models (Arnold 
et al., 1999, 2001; Arnold & Press, 1989; Liu et al., 2014). Returning to the random coefficient 
analysis example, the within-cluster associations in Z, can equivalently be expressed as a pair of 


regression models 


Xyij = May + V11(X2i; — Hs) + yi; 


(14) 
eri; ~ N(0,02,) 


X2ij = Haj + Yo1(%ij = H1;) + ei; (15) 
eri; ~ N(0, 02;) 
where the leading subscript on the slope coefficient indexes the outcome variable. Three points 
are worth noting. First, because predictor variables are centered at their latent group means, y;; 
and y2, are “pure” within-cluster regression slopes (Kreft et al., 1995; Raudenbush & Bryk, 


2002). Second, the group means, 1, ; and (42 ;, are normally distributed latent variables that are 


5 The Blimp application described later can implement latent or manifest group means, with latent cluster means set 
by default. In our extensive test simulations, we have rarely observed meaningful differences between the two 
methods. 
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identical to random intercepts (1.e., random effects). Finally, although neither equation appears to 
condition on X3, this variable does in fact influence X, and Xz indirectly via their cluster means, 
My; and [2 ;. The equations below illustrate this point. 

Next, the between-cluster variances and covariances in Zz can be modeled with the 


following set of level-2 regression models. 


Maj = by + M11 (Ha; = Ha) at N12 (x3; — Hs) +04 


(16) 
61j ~ NCO, G7.) 

Hej = M2 + Naa (Hr; — ;) mt N22(%x3; — liz) a oF (17) 
62; ~ N(0,07,) 

x3; = Hz + N31 (Hay — Ha) + N32(He; — 2) + 3; (18) 


Two points are worth noting. First, we include regressions for the latent group means (i.e., 
random intercepts) because these quantities are integral to level-1 and level-2 imputation and 
must be estimated at every iteration of estimation. Second, the explanatory variables in each 
equation are again centered, such that the grand means function as fixed intercepts. Here again, 
our procedure is different from previous work (Erler et al., 2017; Erler et al., 2016; Goldstein et 
al., 2014) because we explicitly model the level-1 and level-2 parts of all predictors, treating the 
latter terms as normally distributed latent variables. 

Having defined the necessary regressions, we can now derive the posterior distribution of 
the missing values in the same manner as we did for a single-level analysis. To illustrate, 
consider the random slope predictor X,. As before, X,’s distribution has a complex form that 
depends on the product of two normal distributions (and two sets of model parameters)°. 


19 
N(XANV.X5.X0) « N(VIX.. X5.X5) & N(X1X,. X>) oe 


® The generic probability notation may not convey the fact that X, conditions on X; via its latent group mean in the 
between-cluster part of the model (see Equation 16). We could instead write the conditional distribution as 
D(X IY, Xo, X3, Xi), Xow) © P(Y |[X1,X2,X3,X1(By, Xocwy) X P(X1|Xz, Xz, X1 (By, X2(py ), where Xy(gy and Xpcgy are 
the between-cluster parts of the covariate. 
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x N(tuaj + V11(%2ij — Ua;), Ge) 


Because x,;; 1s conditionally normal given the other predictors, we can derive the exact 
distribution of the missing values by comparing the product of the two normal kernels to the 


form of a univariate normal distribution and matching powers of X,, as follows. 


de. (B: +b, + Baxs;) (v1 — Bo — bo — BoX2ij — B3X3;) + 0? (uj + (x2 ij + H2,j)) 


2 
Ox (B: +b, + BX3;) Oe 
Re: (20) 
0; Oe, 


2 
oé, (B: +b, + BX3;) Ge 


X1ij(mis) = N 


Examining the correct conditional distribution highlights the fact that the reverse random 
coefficient imputation model from Equation 10 is incompatible with the analysis model (e.g., 
because it assumes constant variance, whereas the correct variance is a non-linear function). 
Also, the previous expression highlights that missing values condition on all variables in the 
analysis, not just the covariates. For this reason, we would expect the procedure to accommodate 
a range of MAR processes that depend on the analysis variables (e.g., missingness induced by 
the outcome). 

Analogous distributions can be derived for the other explanatory variables, although the 
form of each distribution will generally depend on the level at which a covariate is measured as 
well as its role in the analysis. For example, consider the level-2 predictor X3. Because this 
variable is constant for all observations in level-2 cluster /, its conditional distribution features a 


product over all observations in that group. 
P(X3|Y,X1,X2) « p(Y |X1,X2,X3) X p(X3|X1, X2) 
nj 
x“ | |» ((Bo a boj) + (B sa bij )X1ij Bs Bo(x2i;) a B3(x3;),02 ) (21) 
i=1 
x N(us3 te 1131 (1; = My) oc 1132 (Ho; = U2), o7,) 


In practice, it is more straightforward to use a Metropolis sampling step (Hastings, 1970) to draw 


imputations from p(Y|X;, X_,) X p(X,|X_,) because this approach can approximate a target 
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distribution such as Equation 20 by simply evaluating candidate imputations in both normal 
likelihood functions. This eliminates the need to derive exact distributions for every unique 
application. As described in the next section, the Metropolis algorithm is one computational step 
in the Gibbs sampler that generates parameter values, random effects, and latent group means. 
Interested readers can find the technical details for the Metropolis sampler in the online 
supplemental material. 

A brief sidebar about centering is warranted given its important role in multilevel 
analyses. We use latent group mean centering in the covariate models to partition explanatory 
variables into within- and between-cluster components, and we do not impose any structure on 
their covariance matrices (e.g., the association between X4;; — M1; and x2;; — U2; need not be the 
same as that between 1; and /42;). For simplicity, we did not center predictors in the analysis 
model. If the substantive goal seeks to disentangle within- and between-cluster influences of a 
level-1 covariate, latent group means can also be added to the Y part of the imputation model 
(e.g., 4; and 2; could be specified as level-2 predictors of Y) and the resulting imputations can 
then be centered at their grand or group means (Enders & Tofighi, 2007; Kreft et al., 1995). 

Gibbs Sampler Algorithm for Bayesian Estimation 

Model-based imputation requires parameter values and random effect estimates for the 
substantive model and a set of parameter values and latent group means (i.e., random intercepts) 
for each explanatory variable. We use an iterative Bayesian estimation algorithm known as the 
Gibbs sampler to generate these quantities at each computational cycle, and a final Metropolis 
within Gibbs step draws incomplete covariate scores from their target distributions. The 
Bayesian paradigm views the model parameters, random effects, latent group means, and 
missing values as random variables that have a joint distribution, and the Gibbs sampler 
estimates one quantity at a time, drawing values from a probability distribution that conditions 
on a prior distribution and the current values of all other variables. In the interest of space, we 
sketch the major algorithmic steps here and refer interested readers to the online supplemental 
material for a technical exposition of full conditional distributions (e.g., the form and parameters 
of each distribution, default and user-specifiable priors). 

The full cadre of estimation steps for a two-level model with continuous variables is 
given below. The first five steps generate estimates for the substantive analysis model, and the 


remaining steps target explanatory variables. To simplify the presentation, we index the entire set 
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of covariates as r= 1, ..., R, such that y < P corresponds to either a level-1 observation or its 
corresponding level-2 group mean, and r > P refers to a manifest level-2 variable. The 


computational steps for a single iteration ¢ are as follows. 


1. Draw regression coefficients in B® from p(B| :) 
2. Draw the residual variance of © from p(o2|°) 


3. Draw random effects bY from p(B,| -) 

4. Draw the random effect covariance matrix ©) from p(Zp| -) 
5. Draw missing outcome scores Vous from p(Y |X,, X_,) 

6. Draw latent cluster means wo from D(U, jl -) forr=1toP 


7. Draw the grand mean Te from p(u,| +) forr=1toR 
8. Draw within-cluster regression coefficients in yo? from p(y,| +) forr=1 to P 


9. Draw the within-cluster residual variance an” from p(oz | -) forr=1toP 


(t) 


10. Draw between-cluster regression coefficients 9,” from p(77,| +) forr=1 to R 


11. Draw the between-cluster residual variance gee from p(oz.| -) forr=1toR 


12. Using a Metropolis sampler, draw missing covariates hs from p(Y |X,, X_,) X 


p(X, |X_,) forr=1toR 


The dot after the vertical pipe conveys the idea that the entire set of variables being conditioned 
on are fixed at their current values (e.g., the first step draws the substantive model’s regression 
coefficients from a distribution that conditions on random effects, variance components, and 
imputed data from the previous iteration). Finally, note that a given model or variable may only 
require a subset of these steps. For example, if the outcome variable is complete, steps 1 through 
4 are needed for the posterior distributions of the covariates, but imputation step 5 is omitted. 
Categorical Variables 

Thus far we have focused on continuous explanatory variables, but model-based 

imputation readily accommodates incomplete ordinal and nominal variables. We use a 


cumulative probit model for ordinal variables and a multinomial probit model for nominal 
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responses (Agresti, 2012; Albert & Chib, 1993; Carpenter & Kenward, 2013; Jiao & van Dyk, 
2015; Johnson & Albert, 1999; Mcculloch & Rossi, 1994). In the interest of space, we describe 
the procedure for binary and ordered responses here and take up the multinomial probit model in 
a separate work. Additional details on probit imputation schemes (also known as latent variable 
imputation) are widely available in the literature (Asparouhov & Muthén, 2010; Carpenter & 
Kenward, 2013; Enders et al., 2018; Goldstein, Bonnet, & Rocher, 2007; Goldstein, Carpenter, 
Kenward, & Levin, 2009). 

Probit regression imagines discrete responses arising from an underlying normal latent 
variable distribution (Agresti, 2012; Albert & Chib, 1993; Johnson & Albert, 1999). We denote 
the discrete and latent versions of covariate r as X,. and X;, respectively. The probit model 
defines the underlying latent variable as a z-score, with the linear predictor from a regression 
model defining the center of the distribution and the variance fixed to establish a scale. For a 
binary covariate, the model additionally incorporates a threshold parameter, x, that divides the 
latent distribution into two segments, such that X; is below the threshold when X;. equals zero 
and above the threshold when X,. equals one. This threshold parameter, which is typically fixed 
at zero to avoid redundancy with the fixed regression intercept, can be viewed as the z-score 
cutoff, above which the discrete score changes from zero to one. The probit model for ordinal 
variables has an identical formulation but incorporates additional threshold parameters. For 
example, an ordered categorical variable with c = 1, ..., C response options requires C — 1 
threshold parameters, such that X, = c ifk,_, < X; < K-. In this situation, the first threshold is 
still fixed at zero, but the remaining thresholds are updated at each iteration of the Gibbs 
sampler. We use the Metropolis-Hastings step described by Cowles (1996) for this purpose 
because it converges more quickly than other algorithms (Albert & Chib, 1993). 

To illustrate imputation for categorical explanatory variables, reconsider the random 
coefficient analysis from Equation 9, this time treating the level-1 covariate Xz as binary. We 
previously motivated model-based imputation by assuming a multivariate normal distribution for 
the explanatory variables because this ensures that we can derive mutually compatible 
imputation models. For categorical covariates, the normality assumption applies to the 
underlying latent scores. This necessitates a change to the level-1 normal distribution, where 


diagonal elements corresponding to the categorical variables are fixed at unity, as follows. 
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Xa: oz oO P 
(i) ~ MN(Hy, Es) z= ( % ae) (22) 


2ij OxX3x, 


Introducing latent variables changes the within-cluster regressions, which now model 


associations on the latent metric. 


Xyij = May + V11(%3ij aa Ho) + yi; 


(23) 
eri; ~ N(0,02,) 


X2ij = Maj + Yor (X1ij = Mj) + ei; (24) 
ex, ~ N(0O,1 — V219x,) 

Importantly, the residual variance in X>’s equation is no longer a free parameter but is a 
deterministic subtraction of explained variance from the total within-cluster variance’, which is 
fixed at unity in the joint distribution of the covariates (Equation 22). An alternate 
parameterization fixes the residual variance in Equation 24 to unity, which then defines total 
variance as | + 519%, - No changes are needed for the between-cluster part of the model, so 
Equations 16 to 18 also apply to this example. 

Given a full sample of latent variable scores, standard Bayesian estimation steps generate 
the parameter values, random effects, and latent group means required for imputation. The Gibbs 
sampler algorithm outlined in the previous section is augmented with an additional step that 
draws latent variable scores for each case, after which it performs the estimation steps treating 
the synthetic scores as real data. For cases with observed data, latent variable scores are drawn 
from a truncated normal distribution, such that observed scores of zero and one have latent 


scores below and above the threshold, respectively (¢.g., x3;; < if X2j; = 0, and x3;; 2 « when 
X2ij = 1). Robert (1995) describes an efficient approach for sampling from tails of a truncated 


normal distribution, but synthetic values can also be generated by repeatedly drawing values 


from a normal distribution until obtaining a score in the desired range. 


7 In Equation 28, o% , 1s the within-cluster variance of X, because latent group mean centering removes all between- 
cluster variance from the level-1 predictors. In the case of a level-2 or level-3 categorical variable, Ox, would reflect 
the total between-cluster variation at that level. With two or more predictors in a covariate model, y2, Ox, is replaced 
by the analogous matrix expression, y'Zyy, where Zy is the relevant within- or between-cluster covariance matrix. 
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For cases with missing data, the final Metropolis sampling step generates latent variable 
imputations from an unrestricted normal distribution, and it subsequently creates discrete 
imputes for the analysis model by comparing the continuous values to the threshold(s). As such, 
the posterior distribution of the missing values has a complex form that now depends on the 
product of two normal distributions and a function that reflects this categorization process. To 


illustrate, the posterior distribution of X is 


P(X2|Y,X1,X3) « p(¥|X1, X2,.X3) X p(X2|X1, X3) X p(X2|X2) 
o N ((Bo af boj) Fe (B: rr bij) Xj + B2X2i; + Bsx3;,02 ) 
x N (bp; a5 Voi (Mj aa Haj), 1 319%, ) 


x (1(xiiy = 0) (zi = 1) + 14 < 0) aij = 9) 


(25) 


where p(X3|X,,X3) is the distribution induced by the probit model from Equation 24, and the 
indicator functions that comprise the final term reflect the link between the latent and discrete 
imputes (1.e., X; = 1 if X¥; = k and X, =0 if X; < x). Consistent with the procedure for 
continuous variables, we use a Metropolis sampler to draw candidate pairs of latent and discrete 
imputations, retaining those that are likely to originate from the distribution in Equation 25. The 
technical details for the Metropolis step are given in the online supplemental materials. 
Simulation Study 1: 
Two-Level Random Coefficient Analysis 
This section describes the first of three Monte Carlo simulation studies used to evaluate 

model-based imputation. For this simulation, a random coefficient model with a single covariate 


at each level served as the population model. 

Vij = Bo + Bi(x1ij) + Bo(%2;) + Boj + Bij (rij) + &y (26) 
We generated 1000 artificial data sets within each cell of a design that varied five between- 
subjects factors: the intraclass correlation of the level-1 variables (ICC = .10 and .50), number of 


level-2 clusters (J = 30 and 100), within-cluster sample size (n; = 10 and 30), missing data rate 


(15% or 25% missing data on each predictor), and distribution shape of the level-2 predictor 
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(normally distributed versus a single-degree of freedom chi-square). These conditions routinely 
arise in behavioral science research (e.g., the low and high ICCs are typical of cross-sectional 
and longitudinal data, respectively; the sample size combinations are distributed around the 
30/30 rule-of-thumb from the literature), and the missing data rates are high enough to reveal 
biases for a reverse random coefficient imputation strategy (Enders et al., 2019; Enders et al., 
2018; Grund et al., 2016). 

We derived population parameters following variance decompositions given in Snijders 
and Bosker (2012, p. 116-117) and Rights and Sterba (2018). In particular, Rights and Sterba 
(2018) define variance-explained effect size measures that we used to derive meaningful 
parameter values for the simulation. In line with our treatment of explanatory variables (e.g., the 
normal distributions from Equation 11), these authors treat covariates as random variables that 
have within-cluster and between-cluster (i.e., level-1 and level-2) covariance matrices. This is 
convenient for modifying the intraclass correlations and defining variance explained effect sizes 
for the fixed and random parts of the model at each level. To establish a metric for the covariates, 
we constrained the within-cluster variance of X, at one and solved for the between-cluster 
variance that gives the desired ICC. The total variance of the level-2 predictor Xz was also set to 
one, and its correlation with the between-cluster part of X; was r = .30. Finally, we set the total 
variance of Y to 100 and solved for the within- and between-cluster variances that gave the same 
ICC as X,. These arbitrary constraints on the variances allowed us to specify effect sizes and 
solve for the corresponding model parameters. 

Applying expressions from Rights and Sterba (2018), we chose a value for the slope 
variance that explained 10% of the within-cluster variance in the outcome, and we derived the 
fixed level-1 regression slope that accounted for an additional 10% of this variance. Given these 
parameters, the residual within-cluster variance is fully determined. Moving to the level-2 
parameters, the slope variance and between-cluster variance of X, determine a portion of the 
between-cluster variance (this is a consequence of grand mean centering). Similarly, the 6, 
coefficient and the between-cluster variance of X, determine part the variance attributable to the 
fixed effects. Next, we solved for the Bz coefficient that explained an additional 10% of the 
level-2 variance, and we computed the residual (intercept) variance by subtracting out explained 
variance due to the fixed and random effects. Finally, we set the correlation between the random 


intercepts and slopes at r = .30. Our goal for the simulation was to implement effect sizes that are 
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meaningful to substantive researchers and large enough to expose potential problems with the 
imputation procedures. Equations 27 and 28 give the resulting population parameters for the ICC 


= .10 and .50 conditions, respectively. 


Vij =50+ 3.162(x4;;) + .744(X2i;) + boj + bi j(x1i;) + Eij 


boj 7.000 2.510 (27) 
a oo Osi iocto) ej ~ N(0,72) 
Vij = 50 + 3.162(x4;;) + 1.664(x25;) + Boj + dij (x14) + €i; 
(28) 


boj 35,000 5.612 
(7) ~ MNO. (Oe 19 10.000) €ij ~ N(0,40) 


All variables or terms on the right side of the population regression equation were first 
generated by sampling deviation scores from univariate or multivariate normal distributions with 
the desired variances and covariances. To simulate nonnormal data, we created a chi-square 
variate by squaring X> and rescaling it to have a zero mean and unit variance (on average) prior 
to inducing its between-cluster correlation with X,. This resulted in a variable with skewness and 
excess kurtosis values of approximately 2.0 and 9.0, respectively. After generating the predictor 
variables and residual terms, the outcome variable was computed as the weighted sum in 
Equation 27 or 28. 

We imposed a 15% or 25% missing data rate on both explanatory variables, such that 
missing values on X, and Xz were generated as a function of Y and the Y group means, 
respectively®. We used a logistic regression equation to link missingness probabilities to the 
outcome variable. Using a latent variable formulation for logistic regression (Agresti, 2012; 
Johnson & Albert, 1999), we derived intercept and slope coefficients that produced the desired 
missing data rate and a pseudo R? (McKelvey & Zavoina, 1975) value equal to .50 between the 
cause of missingness and the latent propensities. Finally, we sampled a missing data indicator for 


each observation (0 = observed, 1 = missing) from a binomial distribution with success rate 


8 In a second set of simulations not reported here, we created level-1 and level-2 auxiliary variables, A, and A9, that 
were responsible for missingness on X, and X>, respectively. We set the correlation between each covariate- 
auxiliary pair at approximately .50. The model-based imputation results were similar to those presented here, 
although listwise deletion performed much better since the MAR selection mechanism was much weaker. 
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equal to that observation’s missingness probability from the logistic regression model, and we 
deleted scores with indicator values of one. 

We used the Blimp 2.0 application (Enders et al., 2018; Keller & Enders, 2019) to apply 
reverse random coefficient imputation (1.e., conventional fully conditional specification) and 
model-based imputation’. The reverse random coefficient approach uses a model similar to that 
in Equation 10 to impute X,, and it uses the Y and X, cluster means (computed as arithmetic 
averages) as predictors of the missing Xz scores. The algorithmic steps for this version of fully 
conditional specification (Blimp offers others) are identical to invoking the mice.impute.21.pan 
and mice.impute.2lonly.norm functions in the R package MICE (van Buuren, 2011; van Buuren 
et al., 2018). The model-based approach is identical to the procedure described earlier except that 
the simulation model includes a single level-1 covariate rather than two. Similar model-based 
procedures for this particular random coefficient model can be implemented in specialized 
Bayesian analysis programs such as JAGS (Erler et al., 2017; Erler et al., 2016; Grund et al., 
2018; Plummer, 2016) and in the R packages jomo (Quartagno & Carpenter, 2018) and mdmb 
(Robitzsch & Liidke, 2018). 

After examining potential scale reduction factors (Gelman et al., 2014; Gelman & Rubin, 
1992) from several artificial data sets, we generated 10 imputations from a Gibbs sampler 
algorithm with 1000 burn-in and thinning iterations (i.e., imputed data sets were saved at 1000- 
iteration increments). We used the complete-data maximum likelihood estimator in Mplus 8 
(Muthén & Muthén, 1998-2017) to fit the random slope model to the multiply imputed data sets, 
and we wrote a custom R program to pool estimates and standard errors. To provide additional 
comparisons, we also report the complete-data (pre-deletion) and listwise deletion results. The 
online supplemental material also presents limited simulation results for the full information 
maximum likelihood estimator for missing data in Mplus, which can accommodate incomplete 
covariates with numerical integration. Computational tasks were executed on UCLA’s Hoffman2 
supercomputer, and we used a Linux shell script to coordinate simulation tasks. All simulation 
code is available upon request. 


Simulation Study 1 Results 


? We implemented the PRIOR2 options for the substantive analysis and the XPRIOR3 option for the covariate 
models. The technical appendix from the online supplement describes these priors in detail. 
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Although not part of our main simulation design, we began by examining the large- 
sample behavior of the imputation methods in data sets with 1000 level-2 clusters and 50 
observations per cluster. Figure | gives trellis plots displaying relative bias values (the difference 
between an average estimate and its true value expressed as a percent of the true value) for each 
combination of intraclass correlation, distribution shape, and missing data rate. As a rough 
heuristic, published simulations often define bias values less than 10% in absolute value as 
acceptable (Finch, West, & MacKinnon, 1997; Kaplan, 1988), so the figures display these 
thresholds as dashed lines. As seen in the figure, model-based imputation estimates were 
effectively indistinguishable from those of the complete data. Perhaps somewhat surprisingly, 
violating normality by including a skewed level-2 predictor had no material impact on parameter 
recovery. In contrast, there was no situation where fully conditional specification (reverse 
random coefficient imputation) produced uniformly accurate estimates, as the random slope 
variance was consistently underestimated by 10% to 20% (the covariance was also biased). 
Previous studies have also noted this bias (Enders et al., 2018; Grund et al., 2016), which is a 
consequence of incompatibility. Finally, listwise deletion estimates were uniformly and 
substantially biased. Because there is no reason to expect deletion to improve with smaller 
samples, we omit this approach from further discussion. 

Turning to the full stimulation design, Figures 2 and 3 give trellis plots displaying relative 
bias values with normally distributed predictors and 15% and 25% missing data, respectively. 
The online supplemental material gives a full set of graphical and tabular displays of the model- 
based estimates and their bias values. Considered as a whole, model-based imputation estimates 
tracked closely with those of the complete data, and the procedure was clearly preferable to 
reverse random coefficient imputation (fully conditional specification). The combination of ICC 
=.10, small sample size (30 clusters with 10 observations per group), and 25% missing data 
produced the largest bias values, but performance was still quite good for most parameters. For 
comparison, Figure 4 displays the relative bias values from the simulation conditions with a 
skewed level-2 predictor and 25% missing data rate (see the online supplemental material for a 
few set of graphical displays). Consistent with the large-sample simulation, violating normality 
at level-2 had little to no impact on parameter recovery. Because the results did not materially 
differ from those in Figures 2 and 3, no further discussion is warranted. Although imputation is 


our main focus, we also applied the full information maximum likelihood estimator in Mplus to 
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the normally distributed simulation data because this option would be widely available to 
substantive researchers. The maximum likelihood estimates were prone to large biases, 
particularly in the ICC = .50 conditions. These results, which appear in the online supplement, 
underscore that problems related to non-linear terms are not restricted to multiple imputation. 

The trellis plot in Figure 5 displays frequentist confidence interval coverage values for 
the 25% missing data rate condition. Because the other conditions were quite similar, we give the 
full set of plots in the online supplemental material. Coverage is the proportion of estimates 
where the 95% symmetric confidence interval included the true parameter, and the dashed lines 
at .925 and .975 correspond to Bradley’s (1978) so-called liberal criterion. Coverage values 
lower than the nominal 95% rate reflect Type I error inflation (e.g., a coverage value of 90% 
suggests a twofold increase in Type I errors), whereas values greater than 95% reflect 
conservative inference. We restrict our attention to the fixed effects because the literature argues 
that symmetric confidence intervals are inappropriate for variance estimates (Maas & Hox, 2005; 
Snijders & Bosker, 2012). As seen in the figure, coverage values for the level-2 slope coefficient 
were often too low (about 90%) in conditions with only 30 clusters, but the complete-data 
coverage rates exhibited the same pattern. 

Simulation Study 2: 
Two-Level Random Coefficient Analysis with a Categorical Predictor 

The second simulation study evaluated model-based imputation with a categorical 
explanatory variable. The random coefficient analysis from Equation 26 again served as the 
population model, but the level-2 covariate X2 was a dichotomous variable with equal category 
proportions, on average. Creating the binary variable required two small changes to the data- 
generating process, but the simulation design and procedures were otherwise identical. First, to 
derive the true population parameters, we set the variance of the underlying continuous X3 scores 
at .25, which is the same value we would expect from the binary variable. Second, after deriving 
the true parameter values, we generated continuous data and subsequently dichotomized X, by 
splitting the underlying continuous distribution at zero (the population mean). We again used 
Blimp 2 to implement conventional fully conditional specification and model-based 


imputation!®, respectively, and we used listwise deletion as an additional comparison. 


‘0 We implemented the PRIOR2 options for the substantive analysis and the XPRIOR3 option for the covariate 
models. The technical appendix from the online supplement describes these priors in detail. 
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Simulation Study 2 Results 

As a precursor to the full simulation study, we again examined the large-sample behavior 
of the imputation methods in data sets with 1000 level-2 clusters and 50 observations per cluster. 
The trellis plots in Figure 6 show that model-based imputation estimates were virtually unbiased, 
whereas reverse random coefficient imputation (fully conditional specification) consistently 
underestimated the slope variance by 10% to 20% of its true value. Because the full simulation 
produced results that were virtually identical to those from the first simulation study, we give the 
graphical summaries in the online supplement. The results can be summarized as follows: (a) 
model-based estimates largely tracked with those of the complete data, (b) the combination of 
small within-cluster sample size and low intraclass correlation produced the largest bias values 
(similar to those from Figures 2 and 3), and (c) parameter recovery for fully conditional 
specification was meaningfully worse. Finally, coverage values were comparable to those from 
Figure 5 (e.g., complete-data and imputation-based estimates of the level-2 slope coefficient 
were often too low). 

Simulation Study 3: 
Three-Level Analysis with a Cross-Level Interaction 

Thus far we have considered random coefficient models because they have been the 
focus of recent multilevel imputation literature. However, model-based imputation can 
accommodate a much broader range of interactive and non-linear effects. To illustrate its 
performance in a different context, the final simulation examined a three-level random 
coefficient analysis with a covariate at each level and a cross-level interaction involving a level-1 


and level-3 predictor. 


Vijk = Bo + Bi (x1ijx) + Bp. (x2 jx) + B3(X3x) + Bal Xrijx) 3x) 
+ Dor + Diz (Xrijx) + Boje + Daj (Xrijn.) + ij (29) 


For this analysis, the imputation procedure applies the multivariate normal distribution from 
Equation 12 to the covariates, which again induces a set of linear regression models at each level. 
The analysis model functions in the same way as it did before (1.e., defining the distribution of Y 
given the predictors and interaction). In the interest of space, the online supplement gives the 


posterior distributions of the missing data for this problem. 
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The simulation generated 1000 artificial data sets within each cell of a design that varied 
four between-subjects factors: the distribution of variance across levels (20% and 50% of the 
variation distributed across level-2 and level-3), number of level-3 clusters (K = 30 and 100), 
level-1 within-cluster sample size (1; = 10 and 50), and missing data rate for the covariates (15% 
or 25%). The two variability configurations featured 80% or 50% of Y and X,’s variance at 
level-1, with between-cluster variability distributed evenly across the two higher levels (we refer 
to these as the ICC = .20 and .50 conditions, respectively). In these same conditions, 80% or 
50% of X2’s variance was assigned to level-2 with the rest at level-3. The number of level-2 
clusters within each level-3 cluster was held constant at 5, resulting in a range of sample sizes 
between 1500 (30 level-3 clusters and 10 level-1 observations per level-2 group) and 25,000 
(100 level-3 clusters and 50 level-1 observations per level-2 group). As before, missingness on 
the covariates was imposed as a function of the within- and between-cluster parts of the outcome 
variable. Because this simulation is meant as a proof of concept under ideal circumstances, we 
do not investigate the impact of nonnormality. It was somewhat surprising that imputing a 
skewed predictor did not impact parameter recovery, but we would be hesitant to assume that the 
same holds true when the nonnormal variable is part of an interaction. Keller (2019) provides a 
thorough investigation of model-based imputation for multilevel interactive effects, and his work 
examines this issue. 

The effect size measures from Rights and Sterba (2018) readily extend to three-level 
models, so our data-generating process largely mimicked the previous simulations. As before, we 
set the covariate correlations to r = .30, and we specified fixed effects hierarchically, such that 
each coefficient incremented the explained variance at a particular level by 10%. Because 
interaction effects tend to be smaller in magnitude (Chaplin, 1991), we set the cross-level 
interaction coefficient to explain an additional 5% of the within-cluster variance. Finally, we 
identified random slope variances that explained 10% of the outcome’s variance at level-1, and 
we determined the residual variance at each level by subtracting out the explained portions of 
variance due to the fixed and random effects. Equations 30 and 31 give the resulting population 


parameters for the ICC = .20 and .50 conditions, respectively. 


Vije = 49.836 + 3.098( x4 jx) + -724( x2 jx) + -654(X34) + 1.549(x ijn )(X3x) (30) 
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+ Dox + Dix (X1ijx) + Dojx + Di jx (X1ijx) + Eijx 


bon 5.104 1.917), (bojx 5.500 1.990 
(;°" MNO.(1517 000) oo MN©(7'590 s000)) 


Eijk ae N(0,52.000) 


Vij = 49.740 + 2.449( x4 jx) + 1.445 (x2 j~) + 938 (x3%) + 1.225 (Xi jx) (X3K) 
+ Dox + Dix (X1ijx) + doje + Di jx (X1ijx) + Eijx 


bor 11.183 2.243), (Dojx 13.750 2.487 (G31) 
(nt) ~MNO.(Coas sD 2 MNO.(o 487 5.000) 


Eiji om N(0,32.500) 


We again used Blimp 2 to apply just-another-variable imputation (i.e., conventional fully 
conditional specification that treats the cross-level product as a variable to be imputed) and 
model-based imputation'!. We are unaware of other software packages that apply these 
approaches to three-level data. After examining potential scale reduction factors (Gelman et al., 
2014; Gelman & Rubin, 1992) from several artificial data sets, we generated 10 imputations 
from a Gibbs sampler algorithm with 1000 burn-in and thinning iterations (i.e., imputed data sets 
were saved at 1000-iteration increments). As before, we used Mplus’s complete-data maximum 
likelihood estimator to fit the analysis model to the multiply imputed data sets, and we wrote a 
custom R program to pool estimates and standard errors. Computational tasks were executed on 
UCLA’s Shared Hoffman2 Cluster, and we used a Linux shell script to coordinate simulation 
tasks. All simulation code is available upon request. 

Simulation Study 3 Results 

Figures 7 and 8 display average relative bias values for the fixed effects and variance 
components, respectively. In the interest of space, we focus on the 25% missing data rate and 
point readers to the online supplement for a full set of graphical displays. For clarity we omit 
listwise deletion from Figures 7 and 8, but these results are in the supplement. As seen in the 


figures, imputing the incomplete product term with fully conditional specification (i.e., just- 


'l We implemented the PRIOR2 options for the substantive analysis and the XPRIOR3 option for the covariate 
models. The technical appendix from the online supplement describes these priors in detail. 
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another-variable imputation) introduced substantial biases that did not dissipate as sample size 
increased (e.g., the interaction and level-3 slope coefficients were 20% to 40% lower than their 
true values, and a number of variance estimates were also distorted). In contrast, model-based 
imputation estimates were generally quite accurate, with most bias values well below 10%. 
Estimating this model with only 30 level-3 units gave negatively biased estimates of the level-3 
slope and the interaction coefficient, but increasing the number of groups to 100 effectively 
eliminated this issue. Variance estimates were generally quite accurate, and increasing either the 
within-cluster sample size at level-1 (e.g., from 10 to 50) or the number of level-3 groups 
improved parameter recovery. Not surprisingly, listwise deletion estimates were uniformly and 
severely biased (see the online supplement). 

Finally, Figure 9 displays frequentist confidence interval coverage values for the 25% 
missing data rate condition, and the online supplemental material gives a full set of plots. Model- 
based coverage values were almost always within Bradley’s liberal bounds. The complete-data 
coverage values for the level-3 slope coefficient and the interaction coefficient were lower 
(worse) in some cases (e.g., with 30 level-3 groups), which presumably occurred because 
missing data uncertainty widened confidence intervals and counteracted the complete-data 
estimator’s natural tendency for under-coverage. 

Real Data Example 

This section illustrates model-based imputation in the context of a cluster-randomized 
trial of a novel math problem-solving intervention (Montague, Krawec, Enders, & Dietz, 2014). 
The data set, which is available on the Blimp website, features three levels: 6874 repeated 
measurements at level-1 nested in 982 students at level-2, and students nested in 29 schools at 
level-3. Schools were randomly assigned to an intervention (novel curriculum) or control 
(standard curriculum) condition, such that all students within a given school received the same 
treatment. To keep the example relatively simple, we ignore nesting at the school level and focus 
on a two-level model (in fact, there was a relatively small proportion of variation at the school 
level). The analysis is a two-level growth curve model with a cross-level interaction involving 
condition and months (i.e., the group-by-time interaction). In addition to an intervention dummy 
code and its interaction with the temporal predictor, the substantive analysis model features a 


time-varying math self-efficacy rating scale and a lunch assistance dummy code as covariates. 
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probsolv,; = Bo + B, (mathef f,,) + B2(month0;;) + B3(frlunch;) 
+ B,(condition;) + B;(month0,;)(condition;) + bo; (32) 
+ b,;(mathef f,,) + bz; month0;;) + e¢; 


The percentage of missing observations for each incomplete variable were as follows: problem- 
solving (11.5%), self-efficacy (11.45%), lunch assistance status (4.7%). A substantial proportion 
of problem-solving and self-efficacy scores resulted from planned missingness where the control 
group was assessed bimonthly instead of monthly. Regardless of mechanism, time-varying 
variables were imputed by fixing the corresponding temporal predictor at the planned assessment 
dates. 

Appendix A gives the Blimp 2 script for model-based imputation. To add the third level 
of nesting, one would simply list the school-level identifier on the CLUSTERID line (to our 
knowledge, Blimp is the only application that can apply this procedure to three-level data 
structures). The Blimp 2 user guide (Keller & Enders, 2019) provides a detailed description of 
the scripting language, including a number of new conventions and commands that differ from 
its predecessor. A byproduct of our procedure is that the software also gives Bayesian estimates 
(e.g., posterior means and standard deviations) as optional output, so we provide these results as 
a comparison to illustrate a simple sensitivity analysis. We also used Blimp to implement a 
second model-based imputation procedure that incorporates gender (complete), standardized 
math scores (4.7% missing), and the school-level percentage of non-native English speakers 
(complete) as auxiliary variables (these additional variables are added to the MODEL line, and 
gender must be declared as ORDINAL). Finally, although theoretical and computer simulation 
results generally argue against it, we also included fully conditional specification. The procedure 
should achieve near-optimal performance in this example because the interacting variables are 
complete. The software is available as a free download for macOS, Windows, and Linux at 
www.appliedmissingdata.com/multilevel-imputation, and the full set of analysis scripts and the 
data are available from the same URL. 

As a Starting point, it is important to recall that Blimp specifies a multivariate normal 
joint distribution for the explanatory variables (or their latent scores, in the case of discrete 
predictors), and the software does not allow users to specify non-linear relations (e.g., quadratic 


terms or random coefficients) among pairs of covariates. The so-called sequential version of 
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model-based imputation outlined by Ibrahim et al. (2002) and more recently by Erler and 
colleagues (Erler et al., 2017; Erler et al., 2016) offers this flexibility, but this approach is not yet 
available for three-level data structures. Assuming normality for explanatory variables simplifies 
imputation because the user need only specify the substantive analysis on the MODEL line, and 
the software automatically constructs the appropriate covariate models. The cross-level product 
term in the analysis model is specified by joining the interacting variables with an asterisk (e.g., 
month0O*condition), and the random coefficients are specified by listing the random 
predictors to the right of the vertical pipe (e.g., MODEL: .. | matheff month0O;).Ina 
three-level model, Blimp would automatically include random coefficients at level-2 and level-3. 

Executing the script in Appendix A generates imputations for the lower-order variables 
that are consistent with the specified interaction effect, but the product term is not added to the 
filled-in data sets. Rather, Blimp saves uncentered lower-order variables so that users can apply 
group mean or grand mean centering (or leave variables uncentered) prior to computing 
interactions. This is in contrast to fully conditional specification, which treats the interaction as a 
variable to be imputed. In this framework, it may be necessary to work with uncentered variables 
then rescale the imputed product term to approximate a centered solution (Enders et al., 2014). In 
addition to its theoretical and empirical benefits, model-based imputation is highly convenient 
because it allows researchers to apply familiar procedures for probing interaction effects (Aiken 
& West, 1991; Bauer & Curran, 2005). 

Blimp can print a table of potential scale reduction factor diagnostics (Gelman & Rubin, 
1992), and it optionally saves parameter values that can readily be converted to trace plots in 
other software (e.g., an R plotting script is provided with the other files for this example). The 
potential scale reduction factors suggest that the Gibbs sampler converges in approximately 2000 
iterations (i.e., across all models and all parameters, the highest potential scale reduction factor is 
approximately 1.05). Based on this information, we requested 20 imputations from a Gibbs 
sampler with 4000 burn-in and 2000 thinning iterations (i.e., we saved the first data set after 
4000 computational cycles and saved additional data sets every 2000 iterations thereafter). The 
job takes approximately one minute on a 2018 10-core iMac Pro and about two minutes on a 
2017 two-core Macbook Pro. The resulting imputations are compatible with all major analysis 
packages (the SAVE command outputs imputations in stacked format or as separate files), and 


the set of files for this example includes analysis scripts for Mp/us, R, SPSS, and Stata. 
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Table 1 gives the parameter estimates from the analysis, with random effect covariances 
omitted from the table for brevity. The key finding is that the intervention-by-time interaction 
coefficient is positive and significant, meaning that students in the intervention schools exhibited 
more rapid problem-solving gains than students in control schools. The Bayesian slope variance 
estimates are slightly larger than those of the imputation procedures (these quantities can be 
viewed as estimates taken across more than 30,000 imputations), but differences were generally 
slight. In particular, fully conditional specification estimates were not dramatically different from 
those of model-based imputation. As noted previously, fully conditional specification should 
achieve its optimal performance in this example because the random predictor (months since 
baseline) and the interacting variables are complete. Theoretical and simulation results suggest 
that this would not be true in general. 

Discussion 

Despite the broad appeal of multiple imputation and other MAR-based approaches, a 
broad class of regression models featuring interactive effects, polynomial terms, or random 
coefficients are known to cause bias-inducing problems for popular missing data handling 
procedures (Bartlett et al., 2015; Enders et al., 2014; Seaman et al., 2012; Zhang & Wang, 2017). 
A growing body of recent missing data research has focused on fully Bayesian multiple 
imputation methods that are appropriate for interactive and non-linear effects (Bartlett et al., 
2015; Erler et al., 2017; Erler et al., 2016; Goldstein et al., 2014; Kim et al., 2018; Kim et al., 
2015; Zhang & Wang, 2017). Building on these recent developments, this paper outlined a 
model-based multiple imputation methodology designed to handle a wide range of interactive 
and non-linear effects in single-level and multilevel regression models with up to three levels. 
This procedure offers a number of compelling advantages: it (a) has a strong theoretical 
foundation in the Bayesian framework, (b) readily extends to three-level data structures, (c) uses 
latent variables (i.e., random effects) to model between-cluster variation and covariation, (d) 
readily accommodates categorical variables, and (e) produces Bayesian analysis results (e.g., 
posterior means and standard deviations) as a byproduct of estimation. The primary downside of 
the procedure is that covariates can exert non-linear or random influences on the outcome but not 
each other. The so-called sequential approach to fully Bayesian imputation (Erler et al., 2017; 


Erler et al., 2016; Ibrahim et al., 2002) can accommodate certain patterns of non-linearities, but 
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this approach has not been extended to the range of applications that we consider here (e.g., 
three-level models, categorical variables). 

Computer simulation results suggest that model-based imputation is quite effective when 
applied to multilevel models with random coefficients and interaction effects. In most scenarios 
that we examined, estimates tracked closely with those of a complete data analysis and either 
reduced or eliminated biases associated with conventional approaches such as fully conditional 
specification imputation (and maximum likelihood estimation, although our investigation of this 
procedure was quite limited). These improvements were particularly salient for a model with a 
cross-level interaction term. 

Importantly, model-based imputation in Blimp assumes that covariates are multivariate 
normal. It was somewhat surprising that imputing a skewed predictor did not impact parameter 
recovery, but we are hesitant to assume that the same holds true when the nonnormal variable is 
part of an interaction. Further, an obvious avenue for future research is to examine the impact of 
non-normal level-1 covariates. Keller (2019) provides a thorough investigation of model-based 
imputation for multilevel interactive effects, and his work examines these cases. Second, the 
normality assumption implies that covariates are linearly related, and we did not investigate 
scenarios where covariates are non-linearly related. At least for two-level models, it is possible to 
implement a comparable sequential decomposition of the covariate distribution (Ibrahim et al., 
2002) in dedicated Bayesian analysis software such as JAGS (Erler et al., 2017; Erler et al., 
2016; Grund et al., 2018). Because that approach allows some covariates to be non-linear 
functions of others, it could be more robust to normality violations than our method (Liidke et 
al., 2019). In practice, we suspect that researchers would rarely have the information needed to 
correctly specify such non-linearities, but this alternative is important to consider. 

This paper was an initial foray and is necessarily limited in scope and generalizability. 
First, we investigated a small subset of non-linear effects that are possible. The framework can 
readily accommodate three-way and higher interactions, polynomial effects, and combinations of 
interactive and polynomial terms. Virtually nothing is known about the application of model- 
based imputation to these models, and a great deal of research is needed to clarify the 
procedure’s limitations. Second, our paper offered a very limited glimpse into categorical 
variable imputation. Simulations conducted while developing Blimp suggest that latent variable 


imputation can work well in a wide range of situations, but the procedure is almost certainly 
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sensitive to the number and type of categorical variables, the distribution of response options, 
and sample size, to name a few. Finally, the simulation conditions we examined were necessarily 
limited in scope, and multilevel models offer myriad possibilities for manipulating sample sizes, 
intraclass correlations, effect sizes, and number of random effects. 

In sum, our paper outlined a new imputation approach for multilevel models with 
interactive or non-linear effects. Limited computer simulations suggest that model-based 
imputation can offer substantial improvement over conventional imputation methods and 
maximum likelihood estimation. The Blimp application offers a user-friendly environment for 
implementing model-based imputation, and the software’s website has a number of resource 


materials, including analysis scripts for all major software platforms. 
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Appendix A 
Blimp Syntax for Applying Bayesian Estimation and 
Model-Based Imputation with the Real Data Example 


DATA: ~/desktop/example.dat; 
VARIABLES: school student wave condition eslpct 


ethnic male frlunch achgroup stanmath montho 


month? probsolv matheff condbymonth; 

ORDINAL: frlunch condition; 

CLUSTERID: student; 

MISSING: 999; 

MODEL: probsolv ~ matheff monthO frlunch condition 
monthO*condition | matheff montho; 


SEED: 90291; 


NIMPS: 20; 
BURN: 4000; 
THIN: 2000; 


CHAINS: 10 processors 10; 


OPTIONS: estimates latent psr; 


SAVE: stacked = ~/desktop/imps.csv; 
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Table 1 


Parameter Estimates from the Real Data Analysis Example 


Bayes MBI MBI+ AV FCS 


Parameter Mean SD Est. SE Est. SE Est. SE 


Fixed Effect Coefficients 


Intercept 46.468 0.594 46.516 0.577 46.541 0.587 46.613 0.571 
Self-Efficacy Slope 0.511 0.055 0.508 0.054 0.499 0.056 0.498 0.053 
Monthly Change Slope 0.406 0.043 0.410 0.039 0.405 0.039 0.406 0.042 
Lunch Assistance Slope -0.916 0.306 -0.933 0.285 -0.899 0.290 -0.925 0.290 
Condition Slope -0.467 0.270 -0.474 0.267 -0.441 0.266 -0.487 0.272 
Months by Condition Slope 0.330 0.053 0.326 0.051 0.331 0.051 0.331 0.054 
Level-2 (Student-Level) Variance Components 
Intercept Variance 33.877 8.268 30.225 = 7.853 31.993 8.120 30.862 8.034 
Self-Efficacy Slope Variance 0.447 0.105 0.400 0.098 0.408 0.100 0.394 0.100 
Months Slope Variance 0.137 0.032 0.129 0.033 0.128 0.032 0.128 0.032 
Level-1 Residual Variance 12.225 0274 12.206 0.347 12.209 0.346 12.238 0.348 


Bayes = Bayesian posterior summary, MBI = model-based imputation, MBI + AV = model-based imputation 
with auxiliary variables, FCS = fully conditional specification. 


Figure 1, Average relative bias values from the large-sample simulation featuring a random 
coefficient model with either a normal or skewed level-2 predictor. The dashes represent bias 
values of + 0.10. FCS = fully conditional specification (“reverse random coefficient” 


imputation), MBI = model-based imputation, LWD = listwise deletion. 


Intercept 
Level—1 Slope 
Level—2 Slope 

Intercept Var. 
Icept—Slope Cov. 
Slope Var. 


Residual Var. 


Intercept 
Level—1 Slope 
Level—2 Slope 

Intercept Var. 
Icept—Slope Cov. 
Slope Var. 


Residual Var. 


Intercept 
Level—1 Slope 
Level—2 Slope 

Intercept Var. 
Icept—Slope Cov. 
Slope Var. 


Residual Var. 


Intercept 
Level—1 Slope 
Level—2 Slope 

Intercept Var. 
Icept—Slope Cov. 
Slope Var. 


Residual Var. 


@ Complete 1 FCS + MBI VY LWD 


afi oat 
op | 
aan: i 
areeGueel aan 
o—— =e 
iparanenlianre 
EERE EEE 
| 
at = 


Percent Relative Bias 


SSOUBUISSIPT % ST 
QCLIVAOD Z—[OATT [PULION 


SSOUBUTSSTPT % ST 


QCLIVAOD Z—[OATT [PULION 


SSOUBUTSSIPAT % ST 
OVIUVAOD Z—[OAIT PAMoyS 


SSOUBUTSSTPT % ST 
QVIUVAOD Z—[OAVT POMayS 


Figure 2. Average relative bias values from the simulation featuring a random coefficient model 
with normally distributed predictors and 15% missing data. The dashes represent bias values of + 
0.10. FCS = fully conditional specification (“reverse random coefficient” imputation), MBI = 


model-based imputation. 
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Figure 3. Average relative bias values from the simulation featuring a random coefficient model 
with normally distributed predictors and 25% missing data. The dashes represent bias values of + 
0.10. FCS = fully conditional specification (“reverse random coefficient” imputation), MBI = 


model-based imputation. 
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Figure 4. Average relative bias values from the simulation featuring a random coefficient model 
with a skewed level-2 predictor and 25% missing data. The dashes represent bias values of + 
0.10. FCS = fully conditional specification (“reverse random coefficient” imputation), MBI = 


model-based imputation. 
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Figure 5. Confidence interval coverage for the fixed effects from a random coefficient model 
with 25% missing data. The dashes at .925 and .975 represent Bradley’s (1978) so-called liberal 
criterion. FCS = fully conditional specification (“reverse random coefficient” imputation), MBI 


= model-based imputation. 
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Figure 6. Average relative bias values from the large-sample simulation featuring a random 
coefficient model with an incomplete binary level-2 predictor. The dashes represent bias values 


of + 0.10. FCS = fully conditional specification (“reverse random coefficient” imputation), MBI 


= model-based imputation, LWD = listwise deletion. 
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Figure 7. Average relative bias values for the fixed effects from the three-level simulation 
featuring random coefficients, a cross-level interaction, and 25% missing data. The dashes 
represent bias values of + 0.10. FCS = fully conditional specification (“just another variable” 


imputation), MBI = model-based imputation. 
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Figure 8. Average relative bias values for the variance components from the three-level 
simulation featuring random coefficients, a cross-level interaction, and 25% missing data. The 
dashes represent bias values of + 0.10. FCS = fully conditional specification (“just another 


variable” imputation), MBI = model-based imputation. 
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Figure 9. Confidence interval coverage for the fixed effects from the three-level simulation 
featuring random coefficients, a cross-level interaction, and 25% missing data. The dashes at 
.925 and .975 represent Bradley’s (1978) so-called liberal criterion. FCS = fully conditional 


specification (“just another variable” imputation), MBI = model-based imputation. 
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MODEL-BASED IMPUTATION 


Online Supplemental Material 


Craig K. Enders, Han Du, and Brian T. Keller 


e Section A: Bayesian estimation steps and full conditional distributions for model- 
based imputation 

e Section B: Description of model-based imputation for 3-level computer simulation 
model 

e Section C: Trellis plots displaying relative bias and confidence interval coverage 


from all conditions of the computer simulation. 
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A. MCMC Sampling Steps and Distributions for Two-Level Imputation 
This document gives technical details of the two-level Gibbs sampler, 
specifically the full conditional distributions used to draw model parameters, random 
effects, latent means, and missing values for model-based imputation in Blimp. 
Gibbs Sampler Steps for the Analysis Model 
In this section we abandon the scalar notation from the manuscript in favor of 


a more succinct matrix representation of the multilevel model 


where y,; is the vector of outcome scores for cluster j, X; is the corresponding matrix 
of predictor variables (level-1 or level-2), including a unit vector for the intercept, Z, 
is a subset of the level-1 variables in X,; that have a random influence on the outcome 
(e.g., a unit vector and any random coefficient predictors), b; is the column vector of 
level-2 residuals for cluster j, and €; is a vector of within-cluster residuals. A variety 
of sources give the full conditional distributions for this model (Browne, 1998; Browne 
& Draper, 2000; Enders et al., 2018; Lynch, 2007; Schafer & Yucel, 2002; Yucel, 
2008), which we summarize here for completeness. 

To illustrate the following steps more concretely, we will refer to the following 
substantive model, which includes within-cluster and cross-level interaction effects. 
Between-cluster interactions involving pairs of level-2 variables are also possible, but 
it should become evident that the composition of the analysis model has no bearing 


on the covariate models. 
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Viz = Bo + Bi (£143) + By (X9,;) + Bs (x3;;) es (2443) (©2i;) + Bs (x4;) 
+ Be (X5;) + Br (a5) (55) + B05 + O15 (raz) + Eig (SA2) 


bo; 2 
b ~ MN(0, xy) Eig ~ N(0, a2) 


1j 


Step 1: Draw regression coefficients from p(§| +) « p(Y|8,b;, 4,02, X )p(8). 
Assuming a uniform prior, p(9) « 1, the full conditional distribution is a multivariate 


normal distribution. 


B= (> xx, ; 9 Xj (y; — Z;b;) (SA3) 


j=l j=l 
J -1 

By =08 (3-303; 
j=l 


Applied to the model from Equation SA2, the X matrix includes a unit vector for the 
intercept and a column for each explanatory variable and interaction term, the Z 
matrix is comprised of a unit vector and Xj. 

Step 2: Draw random effects for cluster 7 from a multivariate normal 


distribution. 


V,, = (07? ZZ, +551) (SA4) 
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Step 3: Draw the residual variance from p(o2| +) « p(Y|8,6,,¥,,02,X )p(o2). 
We define 1/o? as a gamma random variable and draw the reciprocal of the residual 


variance (i.e., the precision) from a gamma distribution 


Ned 7, 75) 


J 
oe Sele, (SA5) 


with hyperparameters df, and S,, for the prior distribution. The default setting in 
Blimp specifies S,, = 1 and df,, = 2, which corresponds to a gamma(1,.5) prior. Two 
other options are to set S, = 0 and df, = —2 (the PRIOR2 keyword of the OPTIONS 
command) and S,, = 0 and df, =0 (the PRIOR3 keyword), a Jeffreys prior. 

Step 4: Draw the between-cluster covariance matrix variance from p(™,|-) « 
p(Y|B,b;, ,,02, X)p(X,). We define the inverse of the covariance matrix (i.e., the 
precision matrix, ;') as a Wishart random variable. The level-2 precision matrix is 
sampled from a Wishart distribution, conditional on the current parameter estimates, 


level-2 residuals, and imputations. 


Est ~ W((S+ $51)", J + df,) (SA6) 


S,, can be viewed as the inverse of the prior sums of squares matrix based on df, 


degrees of freedom (i.e., prior observations). As such, S + 5. is a sums of squares 
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and cross products matrix based on J + df, observations. The default prior sets Se" 
= Iand df, =p +1, where p is the dimension of 4%. This prior corresponds to 
marginal uniform priors between -1 and 1 for all correlations and a marginal inverse 
gamma prior IG(1,.5) for variance elements. Specifying the PRIOR2 keyword of the 
OPTIONS command sets SS" = 0 and df, = —p — 1, which is equivalent to a uniform 
prior on the elements in %,. Finally, the PRIOR3 keyword sets S35: = 0 and df, = 0. 
For random intercept models with a single level-2 variance component, we draw the 
reciprocal of the variance, 1/07, from a gamma random variable with analogous 
univariate priors based on p = 1. 

Step 5: Draw the imputation for observation 7 in cluster 7 from a univariate 


normal posterior distribution. 


Yij(mis) — N(x,,6 oe 2,;5;, oa?) (SA7) 


Gibbs Sampler Steps for the Covariate Model r 

To convey the estimation steps in the most general way possible, we introduce 
new notation that differs from that in the manuscript. To begin, index the P level-1 
predictors as p = 1, ..., P, and index the Q level-2 predictors as q = 1, ..., Q. As 
explained in the paper, each level-1 variable has a regression model for the 
observations and a regression model for its latent cluster means. Thus, level-1 
estimation involves P computational cycles, and level-2 estimation requires R = P + 
Q computational cycles. To simplify the notation, we index the entire set of variables 
as r= 1, .... R, such that r < P corresponds to either a level-1 observation or its 


corresponding level-2 group mean, and r > P refers to a manifest level-2 variable. 
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Using generic notation, the level-1 and level-2 regression models are given 


below. 
Dy iy = My 5 a L_ nig Vr a Craig 
(SA8) 
Crag ~ N(O, a2) 
Uy = by aE E_»jMp a ee 
. (SA9) 
ae ~ N(O, a) 
where x,.;; is the level-1 score for covariate r, £_,,,; denotes the P — 1 row vector of 


all other level-1 predictor variables except r, centered at their latent group means. 


Turning to the between-cluster regression, the outcome, ~,. ;,, is either a latent group 


rp 
mean (¢.g., %,; = H,,;) when r < P or a manifest level-2 variable when r > P, and 
x_,.; is a R—1 row vector of grand-mean centered level-2 variables other than r. For 
some of the Gibbs sampling steps, it is convenient to concatenate observation-level 
quantities into matrices. For example, the N-row vector of level-1 outcome scores is 
x,;; and the corresponding N by P— 1 matrix of centered level-1 predictors is pe 
Similarly, the J-row vector of level-2 outcome scores is x,; and the J by R matrix of 
mean-centered predictors is are 

To make the notation more concrete, consider the analysis model from 


Equation SA2. The multivariate normality assumption induces the following level-1 


regression models. 


Ligg = bag + V1 (X94; = [19;) + 12 (235; = L3;) + €1;; 
Lai3 = May + Yar (X14; — L1;) + Yoo (x35, = [135) + 945 (SA10) 
@3ij; = Maz + Y31 (@rig — Hay) + Y32 (Pig — Maz) + €3ij 
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For any given model, @,.;; is N-row vector of outcome scores, 7, is a 2-element vector 
of level-1 regression coefficients, and Xs is the N by 2 matrix of latent group mean 
centered predictors on the right side of each equation. The multivariate normality 


assumption also induces the following level-2 regression models. 


Myj = My + M1 (He; = 
Haj = Me + Noy (M4; — by) + Noe (U3, — b3 


33 = bg + N31 (M1; 7 C34 (SA11) 


= ) ) ) 
Day = ba + May (ory — Ma) + a2 (Ma; — ba) + Nag ©3; — 3) + Naa (255 — Ls) + C4; 
@55 = Ms +151 (Haj — ) ) ) 


For any given model, 2, ; is the J-row vector of outcome scores (latent means or 
manifest variables), 7, is a 4-element vector of level-2 regression coefficients, and 


Xx: is the J by 4 matrix of grand mean centered predictors on the right side of each 


equation. 


Step 6. If variable r is measured at level-1 (i.e., r< P), draw its latent cluster 


means from P( Mp5 ‘) x P(X, |My, Mr js Vrs cae X_,.)P (Mp jl, Vy 5 Ui oz), which is the 


univariate normal distribution below. 


p(y | -) —N oe. ae ~ Li; Vr) + oe. (1, ate Z_,..;1,) oO. oF (SA12) 
ve a2 +102 "Oe Ny Oe 


Note that 0? and ot residual variance for covariate r’s observations and latent 
k ‘ 


means, respectively. 
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Step 7: Draw the grand mean of variable r from p(u,.| +) « 
P(X ples Mer FZ, Hrjr Yrr Fe, Xr D(H)» Assuming a uniform prior, p(u,) oc 1, the full 


conditional distribution is a univariate normal distribution. 


(SA13) 


where J the number of level-2 units. As noted previously, x,.; is a latent cluster mean 
when variable r is measured at level-1, and it is a score when r is at level-2. 

Step 8: If variable r is measured at level-1 (i.e., r< P), draw its within-cluster 
regression slopes from p(7,|*) « p(X,|Hp; ps Oe Pope X_,)p(7,). Because latent 
group means replace the regression intercept, the level-1 regression requires that the 
outcome (i.e., a level-1 score) is centered at its current group mean from step 9. In 
line with our previous notation, we denote the N-row vector of centered outcome 
scores as £,,;; Assuming independent uniform priors, p(7,.) « 1, the full conditional 


distribution is a multivariate normal distribution. 


P(y,|*) = MN(4,, Bs, ) (SA14) 
™ Ss iy 
on (X!,,ij X15) Xj rij (SA15) 
om ~~ = 
uy = Oe: (Xing x3) (SA16) 


Note that it is not necessary to account for clustering here because all scores are 


centered at their latent group means. 
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Step 9: If variable r is measured at level-1 (i.e., r< P), draw the within-cluster 
residual variance from p(o2 | +) 0 p(Xy.|Mps Mrs OZ, 5 Mj Yrs Fe,,1 Xp )P(O2,)- First, define 


a level-1 residual as follows 


OT ee al er a (SA17) 


and stack these residuals into an N-row vector define a N-row vector é,.;;. We define 
1/o? as a gamma random variable and draw the reciprocal of the residual variance 


(i.e., precision) from a gamma distribution 


N+df, &.,é,,+8 
ior ~G ( ; fp Snare | (SA18) 


with hyperparameters df, and S,, for the prior distribution. The default setting in 
Blimp (XPRIOR1) specifies S, = 1 and df, = 2, which corresponds to a gamma(1,.5) 
prior. Two other options are to set S, = 0 and df, = —2 (the XPRIOR2 keyword of 
the OPTIONS command) and S,, = 0 and df, = 0 (the XPRIOR3 keyword), a 
Jeffreys prior. Our simulation used a Jeffreys prior, but we have found that the 
default setting prevents between-cluster variances from collapsing when covariates 
have very low intraclass correlations. 

Step 10: Draw between-cluster regression slopes for covariate r from p(m,.| +) « 
P(X, [ps Mes On ae Cee X_,.)p(n,). Because the grand mean replaces the fixed 
regression intercept, the level-2 regression requires the outcome (i.e., a latent group 
mean or level-2 score) to be centered at its grand mean from step 6. In line with our 


previous notation, we denote the J-row vector of centered outcome scores as &,, ;. 
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Assuming independent uniform priors, p(7,.) « 1, the full conditional distribution is a 


multivariate normal distribution 


p(n,|*) = MN (h,, Ba, ) (SA19) 
~ tall — SL laa, ~ 
| (Xj X_.3) Dire rj (SA20) 
a = 
da, = of. (X22, X-3;) (SA21) 


Step 11: Draw the between-cluster residual variance for covariate r 
from p(o2 | +) x p(X, [Mrs Mes 0,1 My.js Vrs Fe, 1 X_,) P(OZ_). First, define a J-row vector of 


level-2 residuals as 


A 


Cr. = L,.5 +: Ly, ma X_. i, (SA22) 


where 1 is a J-row unit vector. To reiterate, x, ; contains latent group means (r < P) 
or manifest level-2 variables (r > P). We define 1/oZ as a gamma random variable 
and draw the reciprocal of the residual variance (i.e., the precision) from a gamma 


distribution 


(SA23) 


oz w@ (Lt Si6rat Sp 
Sr 2 2 


with hyperparameters df, and S,, for the prior distribution. The default setting in 
Blimp (XPRIOR1) specifies S, = 1 and df, = 2, which corresponds to a gamma(1,.5) 
prior. Two other options are to set S, = 0 and df, = —2 (the XPRIOR2 keyword of 


the OPTIONS command) and S,, = 0 and df, = 0 (the XPRIOR3 keyword), a 
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Jeffreys prior. Note keywords induce the same priors at all levels of the data 
hierarchy. 

Step 12: Draw missing values from p(X,.) « p(Y|X,., X_,.) x p(X,|X_,.). For 
each missing score, the Metropolis algorithm draws a candidate imputation 


XxX, 


r(candidate) from a normal proposal distribution 


x 


r(candidate) a 


N(x”? 


o ) 
r(current)? ~ (proposal) 


(SA24) 


where the mean X\ is the current imputation 7 at iteration t, and the variance 


r,i(current) 


2 


(proposal) 18 Chosen to optimize the acceptance rate of the candidate imputations. We 


2 
(proposal) 


2 
(proposal) 


have found that setting o = 9(07_) for level-1 variables and a = 
2.25 (2 ) for level-2 variables tends to give optimal acceptance rates, although the 
Blimp application adaptively tunes the spread of the proposal distribution by 
increasing or decreasing the constant multiplier at regular intervals during the burn- 
in phase in an attempt to achieve an acceptance rate for the imputations between 
0.25 and 0.45 (Gelman et al., 2014; Johnson & Albert, 1999; Lynch, 2007). For each 
incomplete variable, Blimp checks the acceptance rate every 50 iterations by 
computing the proportion of accepted imputations for a particular variable across all 
incomplete observations during the 50-iteration interval (e.g., for 10 incomplete cases, 
the acceptance rate for a particular variable is the proportion of the 500 draws that 
are accepted). If the acceptance rate does not fall between 0.25 and 0.45, the program 
increases (if the acceptance rate is too high) or decreases (if the acceptance rate is too 
low) the variance multiplier. Once the burn-in iterations are complete, tuning checks 


are turned off. Normal proposal distributions are routinely used in Bayesian analysis 


texts (Gelman et al., 2014; Lynch, 2007), but a strong rationale for adopting this 


10 
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distribution in our context is that the correct conditional distribution is, in fact, 
normal (see manuscript Equations 8 and 20). Thus, the Metropolis algorithm provides 
a way to model the true distribution of missing values without deriving the complex 
non-linear functions that define its mean and variance. 

After drawing a candidate imputation from the proposal distribution, the 
Metropolis algorithm calculates the natural logarithm of an importance ratio (IR) 
that quantifies the height of the target density evaluated at the candidate imputation 


proportional to its height when evaluated at the current imputation. 


In(IR.) = lin[py |X, a a Cplcandiatieys a ,Xp)| ae In[p(Xreanaiate)|X—»)]| 
(SA25) 
_ lin [p(Y |X, ae A nidanpeniys 1) Xp) + In [p(X yeurrent|X-r)]] 


Note that p(Y |X,...,X,, , Xp) and (Y |[X,,...,X,, Xp) involve 


candiate)? *** current)? ***? 


the product of likelihoods (or the sum of log likelihoods) when X,, is at level-2; 
P(X, (candiate)|X—>) and p(X, (current)|X—-) always evaluate a single observation. The 
importance ratio defines the probability of a Bernoulli random variable that 
determines whether the candidate value is retained as the current imputation for the 
next iteration. If the importance ratio exceeds unity, the candidate imputation is 
automatically accepted. If the ratio is large but less than one, the candidate 
imputation is likely to be accepted because it has a high probability of originating 
from p(Y|Xp) x p(X,|X_,.). As the ratio decreases, so too does the chance of retaining 
the candidate value because it is unlikely to originate from the target density. To 
account for the natural logarithm, the Metropolis sampler draws a random number u, 
t 


from a uniform distribution U(0,1) and accepts X. () 


r(candidate) as the new current 


imputation for the next iteration t + 1 if In(IR) > In(u;, ). 
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For incomplete categorical variables, the Metropolis sampler computes the 


importance ratio as follows 


= DY Xs sere ce oy sas »Xp)P( Xi candiate |X —r) 
p(Y|X,,...,X. Xp)p(X* [x ; 


r(current) 


IR 


(SA26) 


r(current)? °°") 


where X}(.andiate) 18 a candidate imputation on the underlying latent variable metric. 
Consistent with the procedure for continuous variables, the algorithm draws a 
candidate latent variable score from a normal proposal distribution, and the current 


and candidate synthetic scores, Xr ) and X* 


r(candiate)? respectively, are then 


current 
evaluated in the p(X*|X_,.) components of the ratio. The corresponding discrete 


candidate X,,, ) for the first component of the numerator product is generated 


candiate 


by comparing the latent candidate to the threshold parameter, such that X,, 


candiate) 


= Oif er neva’ <« and Aer leaniarate) =1if AG eriynitate) = Kk. 
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Model-Based Imputation for a Three-Level Model 


The three-level analysis model for the third computer simulation is as follows. 


Yijk = Bo + Pi (145%) + Bo (24) + 83 (%3;,) + By (1 4;%) (3x) 
+ box + Org (rage) + Doge + Or je (@rige) + Sige (SB1) 


io 
i) ~ MN(O, x, ) ta ~ MN(O, Xy,,) ijk ~ N(O, oz) 


a 
Dik Ljk 


Auxiliary variables enter the model as additional explanatory variables. 


The covariate distribution is multivariate normal 


x) ~ MN(j4,8,) 2) ~ MN(p,,8,) 2 ~ MN(u, 5s) (SB2) 


where a!) = (Zrigk)» Mik = (Hrjr), a?) = (Haje>Toje)s Be = (Mie: Hor): a?) = 
(Likes Mon, U3y) Me = (Ly, fl, Ms), S44 is a scalar within-cluster variance, X4 is a 2 by 2 
between-cluster covariance matrix at level-2, and 4, is a 3 by 3 between-cluster 
covariance matrix at level-3. 

The covariate regression models are as follows, and analogous regressions are 


constructed for the latent variable cluster means. 


Die — figs + Cine 
C1ijk ~ N(0, a.) 
Loin = Hop + (L155 — Han )Mo1 + Cojk (SB3) 
Con ~ N(0,0%,) 
Lap = Mz + (Mig — My )O31 + (Mog — He )O32 + 13x 
Tee N(0, 07, ) 
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To make the following expressions more succinct, define y;,, as the linear 
predictor (i.e., predictor value) of Y for observation 7 in level-2 and level-3 clusters j 


and k, respectively. 


Vij = 2+ By (15%) + Be (54) + B3(@3;,) + Bg (X14;%) (3x) 
+ box + Ora (Lrign) + Boge + Or je (Liga) (SB4) 


The Metropolis-Hasting algorithm samples imputations from the following functions 


P(X,|Y, Xo, X3) « p(Y |X1, Xo, X3) x p(X1|Xo, X3) (SBS) 
x N (Gijns a2) x N (Hi jn %, ) 


p(XQ|Y, X,,X3) x p(Y |X,, X_, X3) x p(X2|X,, X3) 


SS a : ' (SB6) 
ox LL. Gian: oz) x N (Hox + (Haj — Hin) Mar o,) 
i=1 


P(X3|Y,X1,X_q) « p(Y |X,, Xq,X3) x p(X3|X1, Xo) 
4 ilk 


oS I] N (Gijts On) (SB7) 
i=1 


x N (3 + (Hig — M1 )O31 + (Hon — He) O32; 7.) 


where n,,; denotes all level-1 observations in a given cluster j (i.e., all observations 


with a particular X, score in common), and n,,),, represents the number of 
observations in a given level-3 cluster k (i.e., all observations with a particular X, 


score in common). 
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C. Full Results from Computer Simulation Studies 
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C. Full Results from Computer Simulation Studies 
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Large Sample Simulation 1: Average Estimates and Bias Values for Model-Based Imputation 
Parameter True Value Avsg. Est. 


ICC 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 


Missing % 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
25% 
25% 
25% 
25% 
25% 
25% 
25% 
25% 
25% 
25% 
25% 
25% 
25% 


Distribution 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Skewed 
Skewed 
Skewed 
Skewed 
Skewed 
Skewed 
Skewed 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Skewed 
Skewed 
Skewed 
Skewed 
Skewed 
Skewed 
Skewed 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Skewed 
Skewed 
Skewed 
Skewed 
Skewed 
Skewed 


Intercept 

Level-1 Slope 
Level-2 Slope 
Intercept Var. 


Icept-Slope Cov. 


Slope Var. 
Residual Var. 
Intercept 
Level-1 Slope 
Level-2 Slope 
Intercept Var. 


Icept-Slope Cov. 


Slope Var. 
Residual Var. 
Intercept 
Level-1 Slope 
Level-2 Slope 
Intercept Var. 


Icept-Slope Cov. 


Slope Var. 
Residual Var. 
Intercept 
Level-1 Slope 
Level-2 Slope 
Intercept Var. 


Icept-Slope Cov. 


Slope Var. 
Residual Var. 
Intercept 
Level-1 Slope 
Level-2 Slope 
Intercept Var. 


Icept-Slope Cov. 


Slope Var. 
Residual Var. 
Intercept 
Level-1 Slope 
Level-2 Slope 
Intercept Var. 


Icept-Slope Cov. 


Slope Var. 


50.000 
3.162 
0.744 
7.000 
2.510 

10.000 

72.000 

50.000 
3.162 
0.744 
7.000 
2.510 

10.000 

72.000 

50.000 
3.162 
1.664 

35.000 
5.612 

10.000 

40.000 

50.000 
3.162 
1.664 

35.000 
5.612 

10.000 

40.000 

50.000 
3.162 
0.744 
7.000 
2.510 

10.000 

72.000 

50.000 
3.162 
0.744 
7.000 
2.510 

10.000 


50.002 
3.163 
0.746 
6.978 
2.509 
9.988 

72.019 

50.006 
3.158 
0.716 
7.068 
2.508 
9.983 

71.984 

49.994 
3.169 
1.673 

35.004 
5.546 

10.027 

40.015 

50.020 
3.149 
1.633 

35.405 
5.621 

10.003 

39.989 

49.999 
al ee 
0.731 
6.975 
2.494 
9.993 

71.996 

50.011 
3.161 
0.726 
7.076 
2.509 
9.995 


Bias 
0.003 
0.020 
0.271 

-0.308 
-0.055 
-0.116 
0.027 
0.011 
-0.120 
-3.767 
0.977 
-0.090 
-0.168 
-0.023 
-0.012 
0.205 
0.533 
0.012 
-1.182 
0.271 
0.039 
0.040 
-0.406 
-1.822 
1.158 
0.158 
0.034 
-0.027 
-0.001 
-0.175 
-1.797 
-0.353 
-0.653 
-0.070 
-0.005 
0.022 
-0.039 
-2.470 
1.084 
-0.038 
-0.049 


0.1 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 


25% Skewed 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 


Residual Var. 
Intercept 

Level-1 Slope 
Level-2 Slope 
Intercept Var. 


Icept-Slope Cov. 


Slope Var. 
Residual Var. 
Intercept 
Level-1 Slope 
Level-2 Slope 
Intercept Var. 


Icept-Slope Cov. 


Slope Var. 
Residual Var. 


72.000 
50.000 
3.162 
1.664 
35.000 
5.612 
10.000 
40.000 
50.000 
3.162 
1.664 
35.000 
5.612 
10.000 
40.000 


71.987 
50.000 
3.151 
1.646 
35.040 
5.029 
10.016 
40.009 
50.024 
3.157 
1.625 
39.315 
5.564 
10.006 
39.993 


-0.018 
0.001 
-0.359 
-1.085 
0.114 
-1.480 
0.164 
0.022 
0.049 
-0.183 
-2.343 
1.071 
-0.866 
0.063 
-0.016 


Simulation 1: Average Estimates and Bias Values for Model-Based Imputation 
Parameter True Value Avg. Est. 


N 
J =30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J =30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J =30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100,n = 30 
J =30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J =30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 


ICC Missing % 


0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 


15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
25% 
25% 
25% 
25% 
25% 
25% 
25% 
25% 
25% 
25% 
25% 
25% 
25% 


Distribution 


Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 


Intercept 

Intercept 

Intercept 

Intercept 

Level-1 Slope 
Level-1 Slope 
Level-1 Slope 
Level-1 Slope 
Level-2 Slope 
Level-2 Slope 
Level-2 Slope 
Level-2 Slope 
Intercept Var. 
Intercept Var. 
Intercept Var. 
Intercept Var. 


Icept-Slope Cov. 
Icept-Slope Cov. 
Icept-Slope Cov. 
Icept-Slope Cov. 


Slope Var. 
Slope Var. 
Slope Var. 
Slope Var. 
Residual Var. 
Residual Var. 
Residual Var. 
Residual Var. 
Intercept 
Intercept 
Intercept 
Intercept 
Level-1 Slope 
Level-1 Slope 
Level-1 Slope 
Level-1 Slope 
Level-2 Slope 
Level-2 Slope 
Level-2 Slope 
Level-2 Slope 
Intercept Var. 


50.000 
50.000 
50.000 
50.000 
3.162 
3.162 
3.162 
3.162 
0.744 
0.744 
0.744 
0.744 
7.000 
7.000 
7.000 
7.000 
2.510 
2.510 
2.510 
2.510 
10.000 
10.000 
10.000 
10.000 
72.000 
72.000 
72.000 
72.000 
50.000 
50.000 
50.000 
50.000 
3.162 
3.162 
3.162 
3162 
0.744 
0.744 
0.744 
0.744 
7.000 


49.987 
49.971 
49.991 
49 993 
3.064 
3.142 
3.119 
3.144 
0.680 
0.674 
0.732 
0.727 
6.203 
6.459 
6.792 
6.798 
2.162 
2.512 
2.358 
2.474 
9931 
9.761 
9.853 
10.008 
71.334 
71.853 
72.124 
71.880 
49.918 
49.960 
49.980 
49.963 
2.969 
3.116 
3.120 
3.170 
0.604 
0.668 
0.727 
0.711 
6.355 


Bias 
-0.025 
-0.059 
-0.018 
-0.014 
-3.096 
-0.657 
-1.358 
-0.569 
-8.639 
-9 382 
-1.565 
-2.317 

-11.388 
-7.730 
-2.971 
-2.883 

-13.859 

0.081 
-6.057 
-1.414 
-0.686 
-2.389 
-1.465 

0.075 
-0.925 
-0.204 

0.172 
-0.167 
-0.164 
-0.079 
-0.041 
-0.073 
-6.098 
-1.450 
-1.322 

0.245 

-18.871 

-10.216 
-2.286 
-4.410 
-9.218 


J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100,n = 30 
J =30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J =30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 


0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 


25% 
25% 
25% 
25% 
25% 
25% 
25% 
25% 
25% 
25% 
25% 
25% 
25% 
25% 
25% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 
15% 


Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 
Normal 


Intercept Var. 
Intercept Var. 
Intercept Var. 


Icept-Slope Cov. 
Icept-Slope Cov. 
Icept-Slope Cov. 
Icept-Slope Cov. 


Slope Var. 
Slope Var. 
Slope Var. 
Slope Var. 
Residual Var. 
Residual Var. 
Residual Var. 
Residual Var. 
Intercept 
Intercept 
Intercept 
Intercept 
Level-1 Slope 
Level-1 Slope 
Level-1 Slope 
Level-1 Slope 
Level-2 Slope 
Level-2 Slope 
Level-2 Slope 
Level-2 Slope 
Intercept Var. 
Intercept Var. 
Intercept Var. 
Intercept Var. 


Icept-Slope Cov. 
Icept-Slope Cov. 
Icept-Slope Cov. 
Icept-Slope Cov. 


Slope Var. 
Slope Var. 
Slope Var. 
Slope Var. 
Residual Var. 
Residual Var. 
Residual Var. 
Residual Var. 


7.000 
7.000 
7.000 
2.510 
2.510 
2.510 
2.510 
10.000 
10.000 
10.000 
10.000 
72.000 
72.000 
72.000 
72.000 
50.000 
50.000 
50.000 
50.000 
3.162 
3.162 
3162 
3.162 
1.664 
1.664 
1.664 
1.664 
35.000 
35.000 
35.000 
35.000 
5.612 
5.612 
5.612 
5.612 
10.000 
10.000 
10.000 
10.000 
40.000 
40.000 
40.000 
40.000 


6.278 
6.772 
6.723 
1931 
2.391 
2.328 
2.466 
10.197 
9.782 
10.187 
10.042 
71.709 
71.959 
71.767 
72.022 
49.908 
49.967 
49.975 
49.946 
3.068 
3.147 
Blot 
SASL 
1.547 
1.566 
1.617 
1.590 
32.774 
32.426 
34.396 
34.264 
4.946 
5.183 
5.383 
5.487 
9.540 
9.825 
9.905 
10.036 
40.105 
39.977 
39.975 
39.962 


-10.311 
-3.255 
-3,959 

-31.027 
-4.745 
-7.255 
-1.772 

L975 
-2.177 

1.869 

0.423 
-0.404 
-0.057 
-0.323 

0.030 
-0.184 
-0.066 
-0.050 
-0.107 
-2.987 
-0.472 
-0.803 
-0.164 
-7.041 
-5.881 
-2.779 
-4.403 
-6.359 
-7.354 
-1.726 
-2.102 

-11.883 
-7.657 
-4.090 
-2.242 
-4.597 
-1.755 
-0.950 

0.356 
0.262 
-0.058 
-0.061 
-0.094 


J=30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J = 100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J = 100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J = 100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J = 100,n= 10 


0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 


25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
25% Normal 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 


Intercept 

Intercept 

Intercept 

Intercept 

Level-1 Slope 
Level-1 Slope 
Level-1 Slope 
Level-1 Slope 
Level-2 Slope 
Level-2 Slope 
Level-2 Slope 
Level-2 Slope 
Intercept Var. 
Intercept Var. 
Intercept Var. 
Intercept Var. 


Icept-Slope Cov. 
Icept-Slope Cov. 
Icept-Slope Cov. 
Icept-Slope Cov. 


Slope Var. 
Slope Var. 
Slope Var. 
Slope Var. 
Residual Var. 
Residual Var. 
Residual Var. 
Residual Var. 
Intercept 
Intercept 
Intercept 
Intercept 
Level-1 Slope 
Level-1 Slope 
Level-1 Slope 
Level-1 Slope 
Level-2 Slope 
Level-2 Slope 
Level-2 Slope 
Level-2 Slope 
Intercept Var. 
Intercept Var. 
Intercept Var. 


50.000 
50.000 
50.000 
50.000 
3.162 
3.162 
3.162 
3.162 
1.664 
1.664 
1.664 
1.664 
35.000 
35.000 
35.000 
35.000 
5.612 
5.612 
5.612 
5.612 
10.000 
10.000 
10.000 
10.000 
40.000 
40.000 
40.000 
40.000 
50.000 
50.000 
50.000 
50.000 
S162 
3162 
plz 
3.162 
0.744 
0.744 
0.744 
0.744 
7.000 
7.000 
7.000 


49.931 
49.900 
49.974 
49.971 
2.995 
3.147 
3.086 
3.148 
1.488 
1.365 
1.561 
1.559 
32.569 
32.632 
34.156 
34.105 
4.898 
4.926 
5.244 
5.452 
10.187 
9.699 
10.202 
10.050 
39.954 
39.990 
39.986 
40.005 
49.954 
A0 972 
49.988 
50.002 
3.095 
3.145 
3.150 
3.165 
0.714 
0.681 
0.719 
0.722 
6.481 
6.506 
6.695 


-0.138 
-0.200 
-0.052 
-0.057 
-5.278 
-0.468 
-2.413 
-0.457 
10.585 
17.924 
-6.181 
-6.322 
-6.945 
-6.765 
-2.413 
-2.557 
12.724 
12.228 
-6.572 
-2.851 

1.871 
-3.014 

2.022 

0.501 
-0.116 
-0.025 
-0.035 

0.013 
-0.092 
-0.056 
-0.024 

0.005 
-2.142 
-0.547 
-0.396 

0.090 
-3,995 
-8.427 
-3,371 
-3.001 
-7.420 
-7.051 
-4.351 


J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100,n = 30 
J=30,n=10 
J = 30, n= 30 
J=100,n=10 
J = 100,n = 30 
J=30,n=10 
J = 30, n= 30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J =30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J =30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J =30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J =30,n=10 
J = 30,n = 30 


0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.5 
0.5 


15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
15% Skewed 
15% Skewed 


Intercept Var. 


Icept-Slope Cov. 
Icept-Slope Cov. 
Icept-Slope Cov. 
Icept-Slope Cov. 


Slope Var. 
Slope Var. 
Slope Var. 
Slope Var. 
Residual Var. 
Residual Var. 
Residual Var. 
Residual Var. 
Intercept 
Intercept 
Intercept 
Intercept 
Level-1 Slope 
Level-1 Slope 
Level-1 Slope 
Level-1 Slope 
Level-2 Slope 
Level-2 Slope 
Level-2 Slope 
Level-2 Slope 
Intercept Var. 
Intercept Var. 
Intercept Var. 
Intercept Var. 


Icept-Slope Cov. 
Icept-Slope Cov. 
Icept-Slope Cov. 
Icept-Slope Cov. 


Slope Var. 
Slope Var. 
Slope Var. 
Slope Var. 
Residual Var. 
Residual Var. 
Residual Var. 
Residual Var. 
Intercept 
Intercept 


7.000 
2.510 
2.510 
2.510 
2.510 
10.000 
10.000 
10.000 
10.000 
72.000 
72.000 
72.000 
72.000 
50.000 
50.000 
50.000 
50.000 
3.162 
3.162 
3.162 
3.162 
0.744 
0.744 
0.744 
0.744 
7.000 
7.000 
7.000 
7.000 
2.510 
2.510 
2.510 
2.510 
10.000 
10.000 
10.000 
10.000 
72.000 
72.000 
72.000 
72.000 
50.000 
50.000 


6.860 
2.220 
2.400 
Zole 
2.449 
9.936 
Q/27 
10.136 
9.826 
71.119 
72.040 
71.994 
72.156 
50.003 
49.950 
49.954 
50.008 
3.024 
3.157 
3.132 
3.169 
0.606 
0.615 
0.663 
0.689 
6.418 
6.443 
6.690 
6.783 
2.076 
2.402 
2.380 
2.496 
10.390 
90723 
10.016 
9.853 
71.625 
71.989 
72.067 
71.985 
49.946 
49.972 


-2.005 
-11.551 
-4.376 
0.094 
-2.443 
-0.645 
-2.733 
1.357 
-1.744 
-1.224 
0.055 
-0.009 
0.217 
0.007 
-0.100 
-0.091 
0.015 
-4.358 
-0.157 
-0.949 
0.207 
-18.528 
-17.334 
-10.841 
-7.381 
-8.309 
-7.963 
-4.434 
-3.107 
-17.299 
-4.321 
-5.166 
-0.549 
3.900 
-2.767 
0.161 
-1.465 
-0.521 
-0.016 
0.094 
-0.020 
-0.108 
-0.057 


J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J =30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J =30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30, n= 30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J =30,n=10 


0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 


15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
15% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 


Intercept 
Intercept 
Level-1 Slope 
Level-1 Slope 
Level-1 Slope 
Level-1 Slope 
Level-2 Slope 
Level-2 Slope 
Level-2 Slope 
Level-2 Slope 
Intercept Var. 
Intercept Var. 
Intercept Var. 
Intercept Var. 
Icept-Slope Cov. 
Icept-Slope Cov. 
Icept-Slope Cov. 
Icept-Slope Cov. 
Slope Var. 
Slope Var. 
Slope Var. 
Slope Var. 
Residual Var. 
Residual Var. 
Residual Var. 
Residual Var. 
Intercept 
Intercept 
Intercept 
Intercept 

Level-1 Slope 
Level-1 Slope 
Level-1 Slope 
Level-1 Slope 
Level-2 Slope 
Level-2 Slope 
Level-2 Slope 
Level-2 Slope 
Intercept Var. 
Intercept Var. 
Intercept Var. 
Intercept Var. 
Icept-Slope Cov. 


50.000 
50.000 
3.162 
3.162 
3.162 
3.162 
1.664 
1.664 
1.664 
1.664 
35.000 
35.000 
35.000 
35.000 
5.612 
5.612 
5.612 
5.612 
10.000 
10.000 
10.000 
10.000 
40.000 
40.000 
40.000 
40.000 
50.000 
50.000 
50.000 
50.000 
3.162 
3.162 
3.162 
3.162 
1.664 
1.664 
1.664 
1.664 
35.000 
35.000 
35.000 
35.000 
5.612 


49.976 
50.028 
3.092 
3.136 
3.123 
3.161 
1.547 
1.539 
1.539 
1.629 
32.958 
33.120 
34.516 
34.600 
4.952 
5.426 
3200 
5.508 
9.946 
9.868 
9.700 
9.899 
39.917 
39902 
39.964 
40.029 
49.905 
49.981 
50.010 
49.997 
207 
3.139 
3.106 
3.175 
1.444 
1.470 
1.577 
1.554 
33.290 
32.789 
34.758 
34.776 
4.649 


-0.048 
0.057 
-2.217 
-0.818 
-1.229 
-0.055 
-7.000 
-7.523 
-7.473 
-2.071 
-5.835 
-5.372 
-1.383 
-1.142 
-11.771 
-3.317 
-4.572 
-1.863 
-0.545 
-1.319 
-2.999 
-1.013 
-0.208 
-0.246 
-0.090 
0.072 
-0.189 
-0.038 
0.020 
-0.005 
-3.964 
-0.742 
-1.764 
0.390 
-13.215 
-11.620 
-5.215 
-6.585 
-4.887 
-6.318 
-0.692 
-0.639 
-17.173 


J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J =30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 


0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 


25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 
25% Skewed 


Icept-Slope Cov. 
Icept-Slope Cov. 
Icept-Slope Cov. 


Slope Var. 
Slope Var. 
Slope Var. 
Slope Var. 
Residual Var. 
Residual Var. 
Residual Var. 
Residual Var. 


5.612 

5.612 

5.612 
10.000 
10.000 
10.000 
10.000 
40.000 
40.000 
40.000 
40.000 


5.119 
5.254 
5.413 
10.455 
10.021 
10.010 
9.833 
39.811 
39.910 
39.915 
39.967 


-8.789 
-6.384 
-3.555 

4.549 

0.212 

0.102 
-1.665 
-0.472 
-0.224 
-0.213 
-0.083 


Simulation 2: Average Estimates and Bias Values for Model-Based Imputation 


N 


J=30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100,n = 30 
J =30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J =30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J =30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 


ICC Missing % 


0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 


15% Intercept 

15% Intercept 

15% Intercept 

15% Intercept 

15% Level-1 Slope 
15% Level-1 Slope 
15% Level-1 Slope 
15% Level-1 Slope 
15% Level-2 Slope 
15% Level-2 Slope 
15% Level-2 Slope 
15% Level-2 Slope 
15% Intercept Var. 
15% Intercept Var. 
15% Intercept Var. 
15% Intercept Var. 


15% Icept-Slope Cov. 
15% Icept-Slope Cov. 
15% Icept-Slope Cov. 
15% Icept-Slope Cov. 


15% Slope Var. 
15% Slope Var. 
15% Slope Var. 
15% Slope Var. 
15% Residual Var. 
15% Residual Var. 
15% Residual Var. 
15% Residual Var. 
25% Intercept 

25% Intercept 

25% Intercept 

25% Intercept 

25% Level-1 Slope 
25% Level-1 Slope 
25% Level-1 Slope 
25% Level-1 Slope 
25% Level-2 Slope 
25% Level-2 Slope 
25% Level-2 Slope 
25% Level-2 Slope 
25% Intercept Var. 


50.000 
50.000 
50.000 
50.000 
3.162 
3.162 
3.162 
3.162 
0.861 
0.861 
0.861 
0.861 
7.556 
7.556 
7.556 
7.556 
2.608 
2.608 
2.608 
2.608 
10.000 
10.000 
10.000 
10.000 
72.000 
72.000 
72.000 
72.000 
50.000 
50.000 
50.000 
50.000 
3.162 
3.162 
3.162 
3.162 
0.861 
0.861 
0.861 
0.861 
7.556 


Parameter rue Value Avg. Est. 


50.073 
50.044 
50.034 
50.006 
3.052 
3.111 
3.108 
3.153 
0.711 
0.739 
0.797 
0.816 
6.839 
6.942 
7.238 
7421 
2.071 
2.476 
2.498 
2.999 
9.824 
9.734 
9,921 
9.778 
71.452 
71.900 
72.016 
71.873 
50.045 
50.008 
50.001 
50.015 
3.084 
S37 
3.099 
3.162 
0.704 
0.759 
0.840 
0.815 
6.756 


Bias 
0.147 
0.088 
0.068 
0.012 

-3.481 
-1.611 
-1.710 
-0.296 
-17.403 
-14.163 
-7.487 
-5.265 
-9.495 
-8.130 
-4,207 
-1.783 
-20.583 
-5.073 
-4 223 
-0.333 
-1.761 
-2.660 
-0.785 
-2.222 
-0.761 
-0.139 
0.022 
-0.176 
0.090 
0.016 
0.002 
0.030 
-2.475 
-0.788 
-1.989 
-0.002 
-18.248 
-11.823 
-2.514 
-5.358 
-10.593 


J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100,n = 30 
J=30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n = 30 
J=100,n=10 
J = 100, n = 30 
J =30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100,n = 30 


0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 


25% Intercept Var. 
25% Intercept Var. 
25% Intercept Var. 


25% Icept-Slope Cov. 
25% Icept-Slope Cov. 
25% Icept-Slope Cov. 
25% Icept-Slope Cov. 


25% Slope Var. 
25% Slope Var. 
25% Slope Var. 
25% Slope Var. 
25% Residual Var. 
25% Residual Var. 
25% Residual Var. 
25% Residual Var. 
15% Intercept 

15% Intercept 

15% Intercept 

15% Intercept 

15% Level-1 Slope 
15% Level-1 Slope 
15% Level-1 Slope 
15% Level-1 Slope 
15% Level-2 Slope 
15% Level-2 Slope 
15% Level-2 Slope 
15% Level-2 Slope 
15% Intercept Var. 
15% Intercept Var. 
15% Intercept Var. 
15% Intercept Var. 


15% Icept-Slope Cov. 
15% Icept-Slope Cov. 
15% Icept-Slope Cov. 
15% Icept-Slope Cov. 


15% Slope Var. 
15% Slope Var. 
15% Slope Var. 
15% Slope Var. 
15% Residual Var. 
15% Residual Var. 
15% Residual Var. 
15% Residual Var. 


7.556 
7.556 
7.556 
2.608 
2.608 
2.608 
2.608 
10.000 
10.000 
10.000 
10.000 
72.000 
72.000 
72.000 
72.000 
50.000 
50.000 
50.000 
50.000 
3.162 
3.162 
3.162 
3.162 
1.926 
1.926 
1.926 
1.926 
37.781 
37.781 
37.781 
37.781 
5.831 
5.831 
5.831 
5.831 
10.000 
10.000 
10.000 
10.000 
40.000 
40.000 
40.000 
40.000 


6.822 
7.160 
7.338 
2.260 
2023 
2.438 
2.587 
10.530 
9.860 
9.960 
10.077 
71.558 
71.715 
71.942 
71.920 
49.956 
50.009 
50.043 
50.041 
3.071 
3.174 
3.148 
3.169 
1.865 
1.845 
1.816 
15799 
34.949 
35.002 
36.671 
36.730 
4.883 
5.555 
5.619 
S113 
9.847 
9.691 
9.998 
9979 
40.050 
39.824 
39.987 
40.015 


-9.718 
-5.245 
-2.886 
-13.340 
-3.256 
-6.501 
-0.809 
5.300 
-1.404 
-0.396 
0.774 
-0.613 
-0.396 
-0.081 
-0.112 
-0.088 
0.017 
0.085 
0.083 
-2.880 
0.362 
-0.448 
0.224 
-3.162 
-4,207 
-5.686 
-6.596 
-7.495 
-7.355 
-2.939 
-2.784 
-16.255 
-4.736 
-3.635 
-2.020 
-1.535 
-3.091 
-0.020 
-0.210 
0.125 
-0.441 
-0.032 
0.038 


J=30,n=10 
J = 30,n=30 
J = 100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J = 100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J=100,n=10 
J = 100, n = 30 
J=30,n=10 
J = 30,n=30 
J = 100,n=10 
J = 100, n = 30 


0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 


25% Intercept 

25% Intercept 

25% Intercept 

25% Intercept 

25% Level-1 Slope 
25% Level-1 Slope 
25% Level-1 Slope 
25% Level-1 Slope 
25% Level-2 Slope 
25% Level-2 Slope 
25% Level-2 Slope 
25% Level-2 Slope 
25% Intercept Var. 
25% Intercept Var. 
25% Intercept Var. 
25% Intercept Var. 


25% Icept-Slope Cov. 
25% Icept-Slope Cov. 
25% Icept-Slope Cov. 
25% Icept-Slope Cov. 


25% Slope Var. 
25% Slope Var. 
25% Slope Var. 
25% Slope Var. 
25% Residual Var. 
25% Residual Var. 
25% Residual Var. 
25% Residual Var. 


50.000 
50.000 
50.000 
50.000 
3.162 
3.162 
3.162 
3.162 
1.926 
1.926 
1.926 
1.926 
37.781 
37.781 
37.781 
37.781 
5.831 
5.831 
5.831 
5.831 
10.000 
10.000 
10.000 
10.000 
40.000 
40.000 
40.000 
40.000 


50.068 
50.045 
50.052 
49.984 
3.061 
SAY 
3.123 
3.107 
1.568 
1.581 
1.756 
1.832 
34.805 
34.800 
36.606 
37.031 
4.736 
D027 
5.582 
5.624 
10.240 
9.891 
10.004 
9.936 
40.025 
39.928 
40.076 
40.026 


0.135 
0.090 
0.105 
-0.031 
-3.216 
-1.358 
-1.241 
-1.741 
-18.571 
-17.917 
-8.829 
-4.865 
-7.877 
-7.890 
-3.110 
-1.984 
-18.778 
-8.647 
-4.269 
-3.561 
2.400 
-1.088 
0.037 
-0.639 
0.062 
-0.181 
0.190 
0.064 


Simulation 3: Average Estimates and Bias Values for Model-Based Imputation 


N 


J =30,n=10 
J =30,n=10 
J =30,n=10 
J =30,n=10 
J =30,n=10 
J=30,n=10 
J =30,n=10 
J =30,n=10 
J=30,n=10 
J=30,n=10 
J=30,n=10 
J=30,n=10 
J =30,n=10 
J =30,n=10 
J =30,n=10 
J=30,n=10 
J =30,n=10 
J=30,n=10 
J=30,n=10 
J =30,n=10 
J=30,n=10 
J =30,n=10 
J =30,n=10 
J=30,n=10 
J=30,n=10 
J=30,n=10 
J =30,n=10 
J=30,n=10 
J =30,n=10 
J =30,n=10 
J =30,n=10 
J=30,n=10 
J =30,n=10 
J =30,n=10 
J=30,n=10 
J =30,n=10 
J=30,n=10 
J=30,n=10 
J =30,n=10 
J =30,n=10 
J =30,n=10 


ICC Missing % 


0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.5 
0.5 
0.5 
0.5 
0.5 


15% Intercept 

15% Level-1 Slope 

15% Level-2 Slope 

15% Level-3 Slope 

15% Interaction Slope 

15% Level-3 Intercept Var. 


15% Level-3 Icept-Slope Cov. 


15% Level-3 Slope Var. 
15% Level-2 Intercept Var. 


15% Level-2 Icept-Slope Cov. 


15% Level-2 Slope Var. 
15% Residual Var. 

15% Intercept 

15% Level-1 Slope 

15% Level-2 Slope 

15% Level-3 Slope 

15% Interaction Slope 

15% Level-3 Intercept Var. 


15% Level-3 Icept-Slope Cov. 


15% Level-3 Slope Var. 
15% Level-2 Intercept Var. 


15% Level-2 Icept-Slope Cov. 


15% Level-2 Slope Var. 
15% Residual Var. 

25% Intercept 

25% Level-1 Slope 

25% Level-2 Slope 

25% Level-3 Slope 

25% Interaction Slope 
25% Level-3 Intercept Var. 


25% Level-3 Icept-Slope Cov. 


25% Level-3 Slope Var. 
25% Level-2 Intercept Var. 


25% Level-2 Icept-Slope Cov. 


25% Level-2 Slope Var. 
25% Residual Var. 
25% Intercept 

25% Level-1 Slope 
25% Level-2 Slope 
25% Level-3 Slope 
25% Interaction Slope 


49 836 
3.098 
0.724 
0.654 
1.549 
5.104 
1.917 
8.000 
5.500 
1.990 
8.000 

52.000 

49.740 
2.449 
1.145 
0.938 
L225 

11.183 
2.243 
5.000 

13.750 
2.487 
5.000 

32.500 

49 836 
3.098 
0.724 
0.654 
1.549 
5.104 
1.917 
8.000 
5.500 
1.990 
8.000 

52.000 

49.740 
2.449 
1.145 
0.938 
1.225 


Parameter True Value Avg. Est. 


49.819 
3.099 
0.706 
0.630 
1.436 
ADD) 
1.893 
7.436 
5.487 
2.015 
8.096 

51.989 

49.723 
2.437 
1.132 
0.857 
1.138 

10.208 
1.985 
4.618 

13.649 
Zoo 
5.113 

32.548 

49.818 
3.089 
0.700 
0.513 
1.367 
4.647 
1.933 
7.560 
5.450 
1.983 
8.219 

51.964 

49.694 
2.403 
1.116 
0.745 
1.057 


Bias 
-0.033 
0.023 
-2.495 
-3.644 
-7.276 
-10.711 
-1.231 
-7.055 
-0.236 
1.271 
1.205 
-0.020 
-0.034 
-0.505 
-1.098 
-8.628 
-7.066 
-8.718 
-11.523 
-7.641 
-0.732 
-5.418 
2.261 
0.148 
-0.036 
-0.297 
-3.341 
-21.541 
-11.749 
-8.962 
0.828 
-5.503 
-0.908 
-0.358 
2.738 
-0.070 
-0.092 
-1.886 
-2.517 
-20.550 
- 13.668 


J=30,n=10 
J=30,n=10 
J=30,n=10 
J=30,n=10 
J=30,n=10 
J =30,n=10 
J=30,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 
J=100,n=10 


0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 


25% Level-3 Intercept Var. 


25% Level-3 Icept-Slope Cov. 


25% Level-3 Slope Var. 
25% Level-2 Intercept Var. 


25% Level-2 Icept-Slope Cov. 


25% Level-2 Slope Var. 
25% Residual Var. 

15% Intercept 

15% Level-1 Slope 

15% Level-2 Slope 

15% Level-3 Slope 

15% Interaction Slope 
15% Level-3 Intercept Var. 


15% Level-3 Icept-Slope Cov. 


15% Level-3 Slope Var. 
15% Level-2 Intercept Var. 


15% Level-2 Icept-Slope Cov. 


15% Level-2 Slope Var. 
15% Residual Var. 

15% Intercept 

15% Level-1 Slope 

15% Level-2 Slope 

15% Level-3 Slope 

15% Interaction Slope 

15% Level-3 Intercept Var. 


15% Level-3 Icept-Slope Cov. 


15% Level-3 Slope Var. 
15% Level-2 Intercept Var. 


15% Level-2 Icept-Slope Cov. 


15% Level-2 Slope Var. 
15% Residual Var. 

25% Intercept 

25% Level-1 Slope 

25% Level-2 Slope 

25% Level-3 Slope 

25% Interaction Slope 
25% Level-3 Intercept Var. 


25% Level-3 Icept-Slope Cov. 


25% Level-3 Slope Var. 
25% Level-2 Intercept Var. 


25% Level-2 Icept-Slope Cov. 


25% Level-2 Slope Var. 
25% Residual Var. 


11.183 
2243 
5.000 

13.750 
2.487 
5.000 

32.500 

49 836 
3.098 
0.724 
0.654 
1.549 
5.104 
1.917 
8.000 
5.500 
1.990 
8.000 

52.000 

49.740 
2.449 
1.145 
0.938 
1.225 

11.183 
2.243 
5.000 

13.750 
2.487 
5.000 

32.500 

49 836 
3.098 
0.724 
0.654 
1.549 
5.104 
1.917 
8.000 
5.500 
1.990 
8.000 

52.000 


10.546 
pa Gs ¥ 
4.748 

13.852 
2213 
5.124 

32.570 

49.835 
3.099 
0.726 
0.620 
1.490 
4.914 
1.903 
7.817 
5.453 
1.981 
7.970 

52.039 

49.754 
2.463 
1.143 
0.876 
1.179 

10.995 
2.230 
4.922 

13.697 
2.440 
5.076 

32.497 

49.836 
3.096 
0.709 
0.608 
1.462 
4.985 
2.002 
8.017 
5.500 
1.941 
8.065 

52.030 


-5.694 
-3.832 
-5.038 
0.739 
-8.625 
2.473 
0.217 
-0.002 
0.017 
0.322 
-5.249 
-3.808 
-3.714 
-0.730 
-2.286 
-0.863 
-0.432 
-0.370 
0.075 
0.027 
0.550 
-0.154 
-6.664 
-3.423 
-1.674 
-0.605 
-1.552 
-0.383 
-1.919 
1.516 
-0.008 
0.001 
-0.080 
-2.116 
-7.039 
-5.605 
-2.342 
4.454 
0.208 
0.002 
-2.459 
0.818 
0.058 


J=100,n=10 
J=100,n=10 
J = 100,n=10 
J = 100,n= 10 
J = 100,n=10 
J = 100,n=10 
J=100,n=10 
J = 100,n=10 
J = 100,n= 10 
J = 100,n=10 
J = 100,n=10 
J=100,n=10 
J = 30,n=30 
J = 30,n = 30 
J = 30,n=30 
J = 30,n=30 
J = 30,n=30 
J = 30,n=30 
J = 30,n=30 
J=30,n=30 
J=30,n=30 
J = 30,n=30 
J = 30,n=30 
J = 30,n=30 
J = 30,n=30 
J = 30,n=30 
J = 30,n=30 
J = 30,n = 30 
J = 30,n=30 
J = 30,n=30 
J=30,n=30 
J = 30,n=30 
J = 30,n=30 
J = 30,n=30 
J=30,n=30 
J = 30,n=30 
J = 30,n=30 
J = 30,n=30 
J = 30,n=30 
J = 30,n=30 
J = 30,n=30 
J = 30,n=30 
J = 30,n=30 


0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.5 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 
0.1 


25% Intercept 

25% Level-1 Slope 

25% Level-2 Slope 

25% Level-3 Slope 

25% Interaction Slope 
25% Level-3 Intercept Var. 


25% Level-3 Icept-Slope Cov. 


25% Level-3 Slope Var. 
25% Level-2 Intercept Var. 


25% Level-2 Icept-Slope Cov. 


25% Level-2 Slope Var. 
25% Residual Var. 

15% Intercept 

15% Level-1 Slope 

15% Level-2 Slope 

15% Level-3 Slope 

15% Interaction Slope 
15% Level-3 Intercept Var. 


15% Level-3 Icept-Slope Cov. 


15% Level-3 Slope Var. 
15% Level-2 Intercept Var. 


15% Level-2 Icept-Slope Cov. 


15% Level-2 Slope Var. 
15% Residual Var. 

15% Intercept 

15% Level-1 Slope 

15% Level-2 Slope 

15% Level-3 Slope 

0.15 Interaction Slope 
0.15 Level-3 Intercept Var. 


0.15 Level-3 Icept-Slope Cov. 


0.15 Level-3 Slope Var. 
0.15 Level-2 Intercept Var. 


0.15 Level-2 Icept-Slope Cov. 


0.15 Level-2 Slope Var. 
0.15 Residual Var. 

0.25 Intercept 

0.25 Level-1 Slope 

0.25 Level-2 Slope 

0.25 Level-3 Slope 

0.25 Interaction Slope 
0.25 Level-3 Intercept Var. 


0.25 Level-3 Icept-Slope Cov. 


49.740 
2.449 
1.145 
0.938 
1.225 

11.183 
2.243 
5.000 

13.750 
2.487 
5.000 

32.500 

49 836 
3.098 
0.724 
0.654 
1.549 
5.104 
1.917 
8.000 
5.500 
1.990 
8.000 

52.000 

49.740 
2.449 
1.145 
0.938 
1.225 

11.183 
2.243 
5.000 

13.750 
2.487 
5.000 

32.500 

49 836 
3.098 
0.724 
0.654 
1.549 
5.104 
1.917 


49.749 
2.430 
1.121 
0.850 
1.136 

11.137 
2.198 
5.038 

13.850 
2.413 
5.119 

32.508 

49.837 
3.102 
0.713 
0.591 
1.449 
4.629 
1.843 
7447 
5.498 
2.001 
7999 

52.053 

49.798 
2Aa9 
1,139 
0.880 
1.083 

10.421 
2.148 
4.444 

14.550 
2.053 
Rife | 

32.533 

49.813 
3.116 
0.695 
0.565 
1.347 
4.630 
1.870 


0.018 
-0.777 
-2.058 
-9 458 
-7.220 
-0.408 
-2.010 

0.769 

0.729 
-2.999 

2.381 

0.026 

0.002 

0.123 
-1.575 
-9.673 
-6.463 
-9 300 
-3.881 
-6.908 
-0.035 

0.550 
-0.057 

0.101 
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Interval Coverage: Normal Distribution, 25% Missingness 
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Simulation 3 


Interval Coverage: 15% Missingness 
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Simulation 3 
Interval Coverage: 25% Missingness 
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